Chapter 1


INTRODUCTION


The ELIM-COMPLIP System, developed by the General Research Corporation
(GRC) and its predecessor organization, Research Analysis Corporation (RAC),
is one of the key tools used for Army manpower planning. The major
components of the system are ELIM (Enlisted Loss Inventory Model), which is
used to monitor, analyze, and make the official forecast of enlisted
losses, and COMPLIP (Computation of Manpower Programs Using Linear
Programming), which is used to generate the official Army manpower program,
as well as programs used to evaluate policy alternatives. A recent
addition to the system is a Gains Module, with the capability to forecast
immediate reenlistments and the available quantities of various
user-defined categories of supply-limited no-prior-service (NPS) gains. The
capability to forecast immediate reenlistments was incorporated into
existing modules of the system. However, a separate module was developed
to forecast the supply-limited NPS gains. This is known as the NPS Gains
Module (NPSGM) and is the subject of this document.

The ELIM-COMPLIP System has been documented in three volumes.^1 This
documentation is currently in the process of being revised into four
volumes^2 to include the recent additions.


^1 Holz, B. W., et al, "The ELIM-COMPLIP System of Manpower Planning
Models," Three Volumes, General Research Corporation, OAD-CR-18, December
1973.

^2 Holz, B. W., et al, "Two New Versions of the ELIM-COMPLIP System,"
Four Volumes, General Research Corporation, OAD-CR-  , in process of
being published.

SUMMARY OF THE ELIM-COMPLIP SYSTEM


General 

The ELIM-COMPLIP System is used by the Manpower Programs Division
(MPD) of the Office of the Deputy Chief of Staff for Personnel (ODCSPER)
to produce the official Army manpower program, as well as to generate
programs for use by the Department of the Army (DA) and Department of
Defense (DOD) in the examination of Army manpower policy alternatives.

The manpower program is a forecast of various categories of Active
Army strength, gains, and losses and the Reserve Enlisted Program (REP)
of entry on active duty for training (ADT). A manpower program, which
covers each month of the current fiscal year (FY), sometimes the
immediately preceding FY, and from two to six future FYs, reflects the
current status of Army manpower, recent past experience, and plans and
assumptions concerning the future.

Inputs to the system include a variety of historical data, a large
part of which is input from other automated systems, and a number of user
specifications. Included in the latter are the following: (a) objectives
(targets) for the Army's operating strength; (b) any applicable limitations
on total end strength and/or man years; (c) projections of officer gains
and losses* and prior-service (PS) enlisted gains; (d) specifications
concerning policies governing such matters as enlistments, reenlistments,
extensions of terms of service, and various types of early release for
enlisted personnel; (e) training objectives for the REP; and (f) the
programmed capacity of the training base.

A number of types of output reports and graphical displays of both
input and output data are available to assist the user in analyzing the
effect of postulated policies and other assumptions. Outputs from the
system have been used for such purposes as the following: (a) decisions
about draft calls during the period FY70 to FY73;** (b) evaluation of the
effect of the discontinuation of the draft and determination of the
requirements for volunteer enlistments; (c) preparation of the budget
for military personnel (MPA); (d) preparation of the Program Objectives
Memorandum (POM), submitted annually to the Office of the Secretary of
Defense (OSD); (e) consideration of proposed discharge programs, such
as that used currently to screen recruits during the first six months
of service; and (f) planning for the training of recruits.

*There is an option that permits COMPLIP to compute these projections
in the light of user specifications concerning the officer force.

**During most of this period only the COMPLIP part of the system was
in existence.

ELIM 

The function of ELIM is to produce forecasts of enlisted losses.
ELIM accomplishes this by applying loss rates to the strengths of
corresponding elements of the enlisted population. The loss rates are derived
from historical data, subject to user modification, when desired, to reflect
assumptions concerning the effect of postulated changes in policy or
practice from that reflected in the historical data. For certain types of
loss, specifically losses associated with any special early release
policies, ELIM relies entirely on user-specified factors.
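
The rate-times-strength calculation described above can be sketched as follows. This is only an illustration under stated assumptions: the cell keys, the rate and strength values, and the function name `project_losses` are hypothetical and are not taken from ELIM itself.

```python
def project_losses(strengths, historical_rates, user_rates=None):
    """Projected losses per population cell = loss rate * strength.

    A user-specified rate, when supplied for a cell, replaces the
    historically derived rate (hypothetical override mechanism).
    """
    user_rates = user_rates or {}
    losses = {}
    for cell, strength in strengths.items():
        rate = user_rates.get(cell, historical_rates.get(cell))
        if rate is None:
            raise KeyError(f"no loss rate available for cell {cell!r}")
        losses[cell] = rate * strength
    return losses


# Hypothetical cells keyed by (population category, months to ETS).
strengths = {("FT", 12): 10000, ("FT", 6): 8000, ("career", 24): 4000}
historical_rates = {("FT", 12): 0.010, ("FT", 6): 0.015}
# A cell whose rate is supplied entirely by the user.
user_rates = {("career", 24): 0.020}

losses = project_losses(strengths, historical_rates, user_rates)
total_losses = sum(losses.values())
```

Keeping the user-supplied rates in a separate mapping mirrors the distinction the text draws between historically derived rates and user-specified factors.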

The population used by ELIM is a profile of the enlisted inventory
derived primarily from the Enlisted Master File (EMF). The objective is
to use as a base for loss projections information concerning the enlisted
population that is the most recent available and that describes the
population in terms of characteristics that can be expected to have an important
influence on the frequency with which losses of various kinds occur.

The model makes separate projections for a number of different causes
of loss associated with a number of different population categories. Losses
are grouped into categories that reflect major manpower policies and/or
consist of components that are relatively homogeneous in the way they vary
with time and with the population variables used in the projection process.

The three versions of ELIM (ELIM-I, ELIM-II, and ELIM-III) differ with
respect to the breakouts used for the first term (FT) Regular Army (RA)
population. In all three versions the RA population is broken out by FT
and career, where the distinction is that of the DCSPER-46 report,^3 i.e.,
careerists are those who have more than 36 months of service, with all
others designated FT. In ELIM-I the FT population is broken out by term
of service and months to expiration of term of service (ETS).

^3 Dept. of Army, "Strength of the Army," DCSPER-46, published monthly.

In ELIM-II a distinction is made between "first timers" (FTI), those
who are serving on their first enlistment contract, and "second
timers" (STI), those who have either reenlisted or extended the term of
service, but according to the DCSPER-46 definition are still designated
FT. The FTI are broken out by term of service and months to ETS, while
the STI are broken out only by months to ETS. The breakout of FTI vs
STI enhances projection accuracy and provides additional historical
strength and loss data that are useful for other applications.

In ELIM-III there is a further breakout of FTI in the first 21 months
of service into a maximum of four groups, designated characteristic groups
or C-groups, where the user specifies the definition of these classes in
terms of characteristics such as age, race, sex, civilian education, and
scores on classification tests. For example, C-group 1 might consist of
high school graduates; C-group 2 of those classified in mental group 1, 2
or 3 who have not graduated from high school; C-group 3 of mental group 4
non-graduates who are aged 18-20; and C-group 4 of all others. It is
anticipated that the user will vary these definitions to correspond to
specific policies that are in effect or under consideration, e.g.,
constraints on the input of high school graduates or those classified in mental
group 4. Another consideration bearing on the user's specification of C-group
definitions is the influence of certain characteristics (e.g., civilian
education, race, and age) on loss projection errors.
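
The example C-group definition given above can be written out as a small classification rule. The sketch below encodes only that one example; the function name `c_group` and its argument names are illustrative, and in practice the definitions are user-specified and would differ from run to run.

```python
def c_group(hs_graduate, mental_group, age):
    """Assign a recruit to one of the four example C-groups.

    Rules follow the example in the text, tested in order:
      1 = high school graduates
      2 = non-graduates in mental group 1, 2, or 3
      3 = mental group 4 non-graduates aged 18-20
      4 = all others
    """
    if hs_graduate:
        return 1
    if mental_group in (1, 2, 3):
        return 2
    if mental_group == 4 and 18 <= age <= 20:
        return 3
    return 4


# Illustrative recruits: (high school graduate?, mental group, age).
recruits = [(True, 4, 25), (False, 2, 19), (False, 4, 19), (False, 4, 22)]
groups = [c_group(*r) for r in recruits]
```

Because the rules are tested in order, every recruit falls into exactly one group, which is what the breakout requires.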

COMPLIP 

The function of COMPLIP is to generate an optimal manpower program,
i.e., a program that both satisfies all of the user specifications, if it
is feasible to do so, and is optimal in some sense, where the user can
exercise some choice with respect to the criterion for optimality* and a
wide range of choice concerning constraints on the manpower program.
Typically the model operates as follows: given user specifications
concerning such matters as operating strength targets, constraints on various
kinds of strength, e.g., total average strength (man years) and/or total
end strength, and plans for the REP and the training base, COMPLIP determines
the monthly levels of untrained (i.e., NPS) accessions for the Active Army
that bring projected operating strength into the closest possible
agreement with the monthly targets. Once this has been accomplished, annual
REP entry on ADT is maximized in such a way that monthly inputs to basic
training centers are smoothed as much as possible.

*Usual practice is to specify two or more criteria to be used in
sequence.
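
COMPLIP itself is a linear program, and the sketch below is not that formulation. It only illustrates, under strong simplifying assumptions, the month-to-month accounting that the optimization works with: strength carried forward, less losses, plus accessions. The single fixed loss rate, the absence of any end-strength, man-year, or training-base constraints, and the function name are all hypothetical.

```python
def accessions_to_meet_targets(start_strength, targets, monthly_loss_rate):
    """Choose monthly accessions that bring strength as close as possible
    to each monthly target when no other constraints apply.

    Returns a list of (accessions, resulting strength), one per month.
    """
    strength = start_strength
    plan = []
    for target in targets:
        after_losses = strength * (1.0 - monthly_loss_rate)
        # Accessions cannot be negative: if strength already exceeds
        # the target, the overage simply remains.
        accessions = max(0.0, target - after_losses)
        strength = after_losses + accessions
        plan.append((accessions, strength))
    return plan


# Hypothetical figures: 2 percent monthly losses, three monthly targets.
plan = accessions_to_meet_targets(1000.0, [1000.0, 1010.0, 900.0], 0.02)
```

In the third month the target lies below the strength remaining after losses, so no accessions are taken and the target is missed from above. A case like this is where constraints and optimality criteria of the kind COMPLIP handles become necessary.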

The new version of COMPLIP, COMPLIP-G2, provides a number of new
options with respect to model formulation. The most important of these
is the capability to deal explicitly with the breakout of FT enlistees
by the C-groups discussed previously in connection with ELIM-III. Thus,
constraints can be imposed, as appropriate, on the projected availability
and allowable input of recruits corresponding to each C-group.
Furthermore, loss rates applicable to each C-group are applied over the first 21
months of service. An automated Matrix Generator for COMPLIP-G2 facilitates
the tailoring of the model for each application.

System  Linkages 

Automated linkages exist between the various modules of the system.
To facilitate use, each type of user-supplied data must be input to the
system only once. When the same data element is required by more than
one module or program it is passed automatically from one to the other.
Further, when a new run is made, the only inputs that must generally be
supplied are those that differ from data used in the preceding run.

THE NPS GAINS MODULE

The categories of enlisted gains listed in the manpower program are
given in Table 1, with FY75 monthly averages for each category. On the
average, there were approximately 25,000 gains per month. About 63 percent
were NPS gains and 23 percent immediate reenlistments. Reenlistments within
2 to 90 days, reenlistments after 90 days, returns to military control,
Reserve Components, and administrative gains accounted for the remaining
14 percent.

Design specifications were developed for methods of projecting each
of these gains categories except Reserve Components.* However, in
consideration of both the significance of the category and the estimated
level of effort required to develop the forecasting capability, the
Study Advisory Group (SAG) directed that development be limited to
programs associated with the two most numerous types of gains, NPS
and immediate reenlistments.

*A discussion of these design specifications is contained in the
Phase I Report of the study, "ELIM-COMPLIP System Improvement."


                               Table 1

                    CATEGORIES OF ENLISTED GAINS


Category                            Monthly average     Percent of
                                        in FY75            total

No-prior-service (NPS)                  15,806              62.5
Immediate reenlistment                   5,852              23.1
Reenlistment within 2-90 days              192               0.8
Reenlistment after 90 days               1,493               5.9
Returns to military control              1,704               6.7
Reserve components                         112               0.4
Administrative gains (all other)           145               0.6

Total                                   25,304             100
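
As a check on the arithmetic, the percentages in Table 1 can be recomputed from the monthly averages. The sketch below uses only the figures in the table; the variable names are illustrative.

```python
# FY75 monthly averages by category, as listed in Table 1.
gains = {
    "No-prior-service (NPS)": 15806,
    "Immediate reenlistment": 5852,
    "Reenlistment within 2-90 days": 192,
    "Reenlistment after 90 days": 1493,
    "Returns to military control": 1704,
    "Reserve components": 112,
    "Administrative gains (all other)": 145,
}

total = sum(gains.values())
percent = {name: round(100.0 * n / total, 1) for name, n in gains.items()}

# Share of the five smaller categories taken together.
two_largest = gains["No-prior-service (NPS)"] + gains["Immediate reenlistment"]
remaining_share = 100.0 * (total - two_largest) / total
```

The recomputed total and percentages agree with the table, and the five smaller categories together come to a little over 14 percent.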

Figure 1 is a system schematic depicting conceptually the extension
of the system to include a Gains Module. Inputs to the module come from
the ELIM data base, augmented by some additional data from the Modern
Volunteer Army (MVA) master file.* Outputs of the Gains Module are input
to the ELIM Inventory Projection Module (IPM) and COMPLIP.

*This file was developed by another GRC study.

A previous GRC study, titled "Evaluation of Army Manpower Accession
Programs" (EAMAP), developed a nonlinear regression system to forecast
Army volunteers and to compute the corresponding seasonal coefficients,
i.e., the factors reflecting the seasonal patterns of enlistments of
various groups, e.g., high school graduates and non-graduate mental groups
1, 2, and 3. The system, which is operational and has been used
successfully, has been adapted for incorporation into the Gains Module for use
in forecasting categories of RA NPS gains that are supply-limited.
Projections of those NPS gains that are demand-limited, such as perhaps
non-graduate mental group 4s, can be determined by means of the
mechanisms provided in COMPLIP-G2.

The nonlinear regression is based on historical time series data
on enlistments and related variables. Forecasting volunteers in a
no-draft environment, based in part on historical data when the draft was
in effect, requires that the historical data that are used be restricted
to true volunteers. Several methods have been developed to separate the
enlistments during the period of the draft into draft-induced and true
volunteers. The so-called "GRC maximum" method (to be discussed later)
is being used to estimate the true volunteers.

Chapter 2


SYSTEM DESCRIPTION


The function of the NPSGM is to provide a means by which forecasts
of RA NPS gains can be made and the corresponding seasonal coefficients
can be computed. The seasonal coefficients are factors reflecting the
seasonal patterns of enlistments of various groups, e.g., high school
graduates, and non-graduate mental groups I, II, and III. The method
used is that of nonlinear multiple stepwise regression. This method
was developed by GRC in a previous study.^1 It consists of a nonlinear
portion used to determine seasonal factors and a linear portion used to
determine the regression coefficients for the independent or explanatory
variables reflecting policies, programs, and economic and other environmental
conditions. The linear portion uses the BMD02R linear multiple stepwise
regression program taken from the statistical package compiled by W. J.
Dixon of the University of California in Los Angeles (UCLA). The package
is known as the UCLA Biomedical Computer Programs.^2

The procedure for forecasting RA NPS gains consists of four major
steps. These are shown schematically in Fig. 2. The first step is the
development of historical time series of 250 categories of RA true
volunteers defined on the basis of such characteristics as sex, civilian
education, and race. The population breakouts are maintained for
subsequent regression runs. The second step is to prepare the regression input
by aggregating the detailed population breakouts into the desired set
of subgroups, adding the time series of the selected explanatory (or
independent) variables, and compiling control cards for the regression
program. The third step is to exercise the regression model in order
to produce the forecasts for each of the desired set of population
subgroups. The fourth step is to aggregate the forecasts of the subgroups
into at most four groups, referred to as characteristic groups or C-groups,
used by COMPLIP; compute seasonal coefficients for each of the C-groups
based on the aggregated forecasts; and produce printer graphs of the
historical time series and of the time series computed by the regression
equation for both the historical time frame and 12 months of forecast.
Graphs are produced for each of the population categories for which a
regression run was made.

^1 Grissmer, D. W., et al, "An Evaluation of Army Manpower Accession
Programs," General Research Corporation, April 1974.

^2 W. J. Dixon, "Biomedical Computer Programs," University of California
Publications in Automatic Computation No. 2, University of California
Press, 1970.

[Fig. 2: schematic of the four-step forecasting procedure. Figure note:
used only in the initial frequency file generation.]
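
The fourth step can be sketched as follows. The text does not give the formula for the seasonal coefficients, so the sketch assumes one common convention: each month's aggregated forecast divided by the average monthly forecast, so that the coefficients average 1.0. That convention, the subgroup names, the forecast values, and the function names are all assumptions made for illustration only.

```python
MAX_C_GROUPS = 4  # COMPLIP uses at most four C-groups


def aggregate_to_c_groups(subgroup_forecasts, c_group_of):
    """Sum monthly subgroup forecasts into their assigned C-groups."""
    totals = {}
    for subgroup, series in subgroup_forecasts.items():
        group = c_group_of[subgroup]
        current = totals.setdefault(group, [0.0] * len(series))
        if len(current) != len(series):
            raise ValueError("all forecasts must cover the same months")
        totals[group] = [a + b for a, b in zip(current, series)]
    if len(totals) > MAX_C_GROUPS:
        raise ValueError("at most four C-groups are allowed")
    return totals


def seasonal_coefficients(monthly_values):
    """Assumed convention: monthly value / mean monthly value."""
    mean = sum(monthly_values) / len(monthly_values)
    if mean == 0:
        raise ValueError("cannot normalize an all-zero series")
    return [value / mean for value in monthly_values]


# Hypothetical 12-month forecasts for three regression subgroups.
forecasts = {
    "male_hsg_cat_1_3": [100.0] * 6 + [200.0] * 6,
    "female": [50.0] * 12,
    "male_nhsg_cat_1_3": [80.0] * 12,
}
c_group_of = {"male_hsg_cat_1_3": 1, "female": 1, "male_nhsg_cat_1_3": 2}

totals = aggregate_to_c_groups(forecasts, c_group_of)
coefficients = {g: seasonal_coefficients(s) for g, s in totals.items()}
```

Under this convention a C-group with a flat forecast has coefficients of 1.0 in every month.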

DEVELOPMENT OF HISTORICAL DATA

The true volunteers during the draft era are estimated using the
"GRC maximum" method. This method stems from the relationship between the
number of enlistees during the draft era and the lottery sequence numbers
(LSNs), as shown in Fig. 3. The graph confirms the conviction that many
of the enlistees volunteered because of draft pressure. Note the leveling
off of the graph after LSN 240. This characteristic formed the basis for
concluding that volunteers with LSN greater than 240 were true volunteers,
since they were in little or no danger of being drafted.

A precise formulation of the estimating relationship is given by the
following equation:

    $V_t = E_{wo} + D_{wo} + \frac{366}{126}\,(E + D)$                (1)

where $V_t$ equals the estimated number of true volunteers, $E_{wo}$ equals the
number of RA volunteers without LSNs, $D_{wo}$ equals the number of volunteers
for the draft without LSNs, $E$ equals the number of RA volunteers with
LSNs 241 to 366, and $D$ equals the number of volunteers for the draft with
LSNs 241 to 366. The rationale behind this estimating relation is as
follows. Volunteers with LSNs in the range 241 to 366 are considered to
be true volunteers. The number of LSNs in this range is 126, while the
total number of LSNs is 366. The ratio 366/126 is used in Eq 1 for the
following reasons: (a) since LSNs are drawn randomly, individuals with
LSNs 241 to 366 are a random sample of all individuals with LSNs; and
(b) the total expected number of true volunteers among individuals with LSNs
includes a proportionate number with LSNs in the range 1 to 240. All
enlistees without LSNs are assumed to be true volunteers.
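
The estimating relationship described above (all enlistees without LSNs, plus 366/126 times those with LSNs 241 to 366) can be sketched as a short function. The function and argument names are illustrative, and the counts in the example are hypothetical.

```python
TOTAL_LSNS = 366           # lottery sequence numbers 1 to 366
SAFE_LSNS = 366 - 240      # LSNs 241 to 366, i.e., 126 numbers


def estimate_true_volunteers(ra_without_lsn, draft_without_lsn,
                             ra_high_lsn, draft_high_lsn):
    """Estimate true volunteers for one month of the draft era.

    Enlistees without LSNs are all counted as true volunteers. Those
    with LSNs 241 to 366 are treated as a random sample of all true
    volunteers holding an LSN and are scaled up by 366/126.
    """
    scale = TOTAL_LSNS / SAFE_LSNS
    return (ra_without_lsn + draft_without_lsn
            + scale * (ra_high_lsn + draft_high_lsn))


# Hypothetical monthly counts.
estimate = estimate_true_volunteers(
    ra_without_lsn=100, draft_without_lsn=20,
    ra_high_lsn=126, draft_high_lsn=0,
)
```

With these counts the 126 high-LSN enlistees scale up to 366, so the estimate is 486 true volunteers.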

The estimating procedure is used to tabulate true volunteers for
each month through June 1973. After June 1973 the accessions are tabulated
directly. The frequency tabulations are made for each cell of two
population partitions, given in Table 2, for each month for which historical data
exist. Thus, a time series is formed for each population cell. Partition
1 includes categories 1 through 196, where the breakouts are based on sex,
civilian education, race, mental group, age, and bonus. Partition 2 includes
categories 197 to 250, where the breakouts are based on term of service,
civilian education, race, and mental group. Each partition is a mutually
exclusive and exhaustive partition of the population. The sources for
these data are the MVA master file* for the draftee true volunteers** and
the ELIM-III cohort file for the RA true volunteers. Table 3 gives the MVA
master file format and Table 4 gives the ELIM-III cohort file format. Monthly
additions to the volunteer time series data base can be made by means of an
update program that uses the Gains transaction file as input. The format
of the Gains transaction file is given in Table 5.

*This file was created under a previous GRC study.

**Extracting these data from the MVA Master File is a one-time operation.

REGRESSION INPUT

The second step in this forecasting procedure is to prepare the input
for the regression program. This is done by means of a regression input
generator that produces a file containing control cards for the regression
program, and the time series for each dependent (i.e., volunteer category)
and independent or explanatory variable to be used in the projection process.
In order to form the time series for the desired volunteer categories, the
user specifies the categories of Table 2 to be aggregated. The aggregations
are based either on partition 1 or partition 2, not both.

[Table 2: CATEGORIES FOR NPS GAINS FREQUENCY DISTRIBUTIONS]


                               Table 3

                       MVA MASTER FILE FORMAT


I.  MVA MASTER FILE

    A.  Density - 800 BPI
    B.  Mode - BCD
    C.  Record size - 120 characters
    D.  Blocking factor - 40 records/block; 4800 characters/block
    E.  Label records - none
    F.  Parity - even

II. DATA ELEMENTS

      Variable name               Chars  Pos       Type  Comments

  1.  5-digit ZIP code              5    1-5        N    No blanks
  2.  Date of birth*                6    6-11       N    No blanks
  3.  Branch of service*            1    12         N    From ACC or DEP-
                                                         Branch-Service
  4.  Lottery category              1    13         N
  5.  Lottery number*               3    14-16      N    Zero fill
  6.  Term of enlistment            1    17         A
  7.  Enlistment option code        4    18-21      A
  8.  Type entry*                   1    22         N    If ACC date=0,
                                                         then 9
  9.  Accession date                6    23-28      N    Zero fill
 10.  DEP date                      6    29-34      N    Zero fill
 11.  AFQT category                 1    35         N    Zero fill
 12.  AFQT score                    3    36-38      N    Zero fill
 13.  Race                          1    39         N    1-7
 14.  Education                     1    40         A
 15.  Number of dependents          1    41         N    0-9
      (not counting self)
 16.  Last service                  1    42         A
 17.  Sex                           1    43         A    M or F
 18.  State                         2    44-45      A
 19.  Blank                         3    46-48      A
 20.  Blank                         2    49-50      A
 21.  Marital status                1    51         A
 22.  Lottery location              1    52         N    1 = 000-60
                                                         2 = 61-120
                                                         3 = 121-180
                                                         4 = 181-240
                                                         5 = 241-300
                                                         6 = 301-366
 23.  RMS                           3    53-55      A
 24.  Training commitment           4    56-59      A
 25.  Waiver                        1    60         A
 26.  SSN                           9    61-69      M    Specifications on temporary
                                                         SSN are currently a mystery
 27.  Status code                   1    70         A    Extracted from DEP or ACC
                                                         status code.
 28.  AQB                          30    71-100     N    999 = no test scores
                                                         available, Jan 70 through
                                                         Mar 73 use psn 71-91.
 29.  Recruiter code*               7    101-107    A
 30.  Transaction date              6    108-113    N    Year/month/day, either ACC
                                                         date or DEP date.
 31.  Age                           2    114-115    N    Attained age in years (not
                                                         rounded) at DEP or ACC date
 32.  Religion                      2    116-117    A
 33.  Blank                         3    118-120    A

*Validity checks have been performed on these fields.
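
A record laid out as in Table 3 can be read by slicing fixed character positions. The following is a minimal sketch, not the original extraction program: it covers only a few of the fields, takes the positions from the table, and uses hypothetical function names and sample values. The lottery location rule follows the coding listed for item 22.

```python
# (start, end) positions are 1-based and inclusive, as listed in Table 3.
FIELDS = {
    "zip_code": (1, 5),
    "date_of_birth": (6, 11),
    "lottery_number": (14, 16),
    "accession_date": (23, 28),
    "race": (39, 39),
    "sex": (43, 43),
    "lottery_location": (52, 52),
}
RECORD_SIZE = 120


def parse_record(record):
    """Slice one 120-character record into named text fields."""
    if len(record) != RECORD_SIZE:
        raise ValueError(f"expected {RECORD_SIZE} characters, got {len(record)}")
    return {name: record[start - 1:end] for name, (start, end) in FIELDS.items()}


def lottery_location(lottery_number):
    """Code 1 = 000-60, 2 = 61-120, ..., 5 = 241-300, 6 = 301-366."""
    if not 0 <= lottery_number <= 366:
        raise ValueError("lottery number must be in 0..366")
    if lottery_number <= 60:
        return 1
    return min(6, (lottery_number - 1) // 60 + 1)


def make_record(values):
    """Build a blank-padded sample record from {field name: text}."""
    buffer = [" "] * RECORD_SIZE
    for name, text in values.items():
        start, end = FIELDS[name]
        if len(text) != end - start + 1:
            raise ValueError(f"{name} must be {end - start + 1} characters")
        buffer[start - 1:end] = text
    return "".join(buffer)


# Hypothetical sample values, for illustration only.
sample = make_record({
    "zip_code": "22101", "date_of_birth": "540317",
    "lottery_number": "245", "accession_date": "720605",
    "race": "1", "sex": "M", "lottery_location": "5",
})
parsed = parse_record(sample)
```

Keeping the positions in one table-driven mapping makes it straightforward to add the remaining fields of Table 3.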
                               Table 4

                    ELIM-III COHORT FILE FORMAT*


Field description           Characters    Code description

SSAN or TIN                   1 - 9
Filler                       10 - 10
Date of birth                11 - 16      (YYMMDD)
AFQT score                   17 - 18      99-invalid
Sex                          19 - 19      M, F, 9-invalid
Race                         20 - 20      1-white, 2-black, 3-other;
                                          9-invalid
Term of enlistment           21 - 21      9-invalid
Training commitment          22 - 25      4 char MOS (not edited)
Civilian education           26 - 26      9-invalid; 0 to 8-no HS; A,B,C,D-
                                          some HS; Y or F-GED; other-HS
                                          graduate
Mental category              27 - 27      1,2,3,4,5; 9-invalid
Enlistment options           28 - 31      1st char: 1-combat arms, 2-service
                                          schools, 3-other, 4-RA (unassigned)
                                          option;
                                          1st char: B-bonus
AQB score summary            32 - 41      each character represents a test
                                          score: 0-invalid; 1,2-score < 90;
                                          3,4 - (90,99); 5,6,7 - (100,109);
                                          8,9 - score >= 110
AQB category                 42 - 42
Lottery number               43 - 45      999-invalid
Number of dependents         46 - 46      9-invalid
Marital status               47 - 47
Moral waiver                 48 - 48
Age in months                49 - 51      999-invalid
Number of transactions       52 - 53      01 - 12
Date of transaction**        54 - 57      (YYMM)
Type of transaction**        58 - 60      GHF - NPS gain
        .
        .
Date of transaction**       131 - 134
Type of transaction**       135 - 137

*The codes on this file have been checked for validity.

**Date and type of transaction codes may be repeated up to a total of
12 sets.
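
Two of the codings in Table 4 lend themselves to a short illustration: the civilian education code and the repeated date/type transaction sets. The sketch below assumes that the sets are stored contiguously, seven characters each, starting at position 54 (which is consistent with the last set occupying positions 131 to 137). The function names and the sample record are hypothetical, and the education codes are used exactly as transcribed in the table.

```python
def education_level(code):
    """Decode the one-character civilian education code (position 26)."""
    if code == "9":
        return "invalid"
    if code in set("012345678"):
        return "no HS"
    if code in set("ABCD"):
        return "some HS"
    if code in ("Y", "F"):
        return "GED"
    return "HS graduate"


FIRST_SET_POSITION = 54   # 1-based position of the first transaction date
SET_LENGTH = 7            # 4-character date (YYMM) + 3-character type
MAX_SETS = 12


def transactions(record):
    """Return the (date, type) pairs stored in a cohort record."""
    count = int(record[51:53])            # positions 52-53
    if not 1 <= count <= MAX_SETS:
        raise ValueError("number of transactions must be 01 to 12")
    pairs = []
    for i in range(count):
        start = FIRST_SET_POSITION - 1 + i * SET_LENGTH
        pairs.append((record[start:start + 4], record[start + 4:start + 7]))
    return pairs


# Hypothetical record: blanks in positions 1-51, then two transactions.
record = (" " * 51 + "02" + "7307GHF" + "7401GHF").ljust(137)
history = transactions(record)
```

Reading only as many sets as the count field indicates avoids interpreting the blank padding as transactions.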
                               Table 5

                  GAIN AND EXTENSION RECORD FORMAT

File Identification:  BE ME 0013

Record Identification:  Gains Transactions/Extensions Transactions
                        DCSPER-46, Part II (Edit)


Relative                                                 No. of      Type of
positions   Name of data element                       characters  characters*

1-15        Name personnel                                 15          M
16-17       Sending PPA code                                2          M
18          Blank                                           1          M
19-20       Service number prefix                           2          A
21-29       Social security account number                  9          M
30-32       Grade in which serving (abbr)                   3          M
33          Grade in which serving                          1          M
34-36       Months service for pay                          3          N
37-41       Primary military occupational specialty         5          M
42          Race                                            1          A
43          Service component                               1          A
44          Term of service or enlistment                   1          N
45-46       Current assignment                              2          M
47-49       Blank                                           3          M
50          Previous regular Army service                   1          M
51-52       Year service code (generated)                   2          M
53-56       Expiration of term of service (yr mo)           4          N
57-60       Parent unit and morning report indicator        4          M
61-62       Sub unit code                                   2          M
63-64       Type transaction                                2          M
65-70       Transaction date (yr mo da)                     6          N
71          Assignment code (generated)                     1          M
72          ETS PETS code (generated)                       1          N
73          Reception station code                          1          M
74-76       Separation program number (previous)            3          M
77-80       Process yr mo (generated)                       4          N
81-82       Category code (generated)                       2          N
83-86       Basic active service (yr mo)                    4          A
87-88       Status                                          2          M
89          Civilian education code                         1          M
90          Mental group                                    1          N
91-92       Armed forces qualification test score           2          N
93-98       Date of birth (yr mo da)                        6          N
99          Sex                                             1          A
100-101     Receiving PPA code                              2          M
102-105     Enlistment options                              4          M
106-109     Basic pay entry (yr mo)                         4          N
110         Dual service component status                   1          A
111         Special personnel category                      1          A
112-115     ETS of previous service (yr mo)                 4          N
116         Pro pay                                         1
117-130     Blank                                          14
131-132     PMOS evaluation score                           2          N
133         Overseas/CONUS code                             1          M
134         Military personnel class                        1          A
135         Record mark                                     1          M

*An A means alphabetic; N means numeric; M means that the field may
contain any characters.
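
The character-type codes in Table 5 suggest a simple edit check on incoming transaction records. The sketch below assumes a strict reading (an A field holds only letters, an N field only digits, with no blanks allowed), which the table does not state. Only a handful of fields are included, and the function names and sample values are hypothetical.

```python
# (name, start, end, type); positions are 1-based inclusive, from Table 5.
FIELDS = [
    ("months_service_for_pay", 34, 36, "N"),
    ("race", 42, 42, "A"),
    ("type_transaction", 63, 64, "M"),
    ("transaction_date", 65, 70, "N"),
    ("mental_group", 90, 90, "N"),
    ("sex", 99, 99, "A"),
]
RECORD_SIZE = 135


def field_ok(value, type_code):
    """Check a field against its type: A alphabetic, N numeric, M any."""
    if type_code == "N":
        return value.isascii() and value.isdigit()
    if type_code == "A":
        return value.isascii() and value.isalpha()
    if type_code == "M":
        return True
    raise ValueError(f"unknown type code {type_code!r}")


def invalid_fields(record):
    """Return the names of fields whose contents violate their type."""
    if len(record) != RECORD_SIZE:
        raise ValueError(f"expected {RECORD_SIZE} characters, got {len(record)}")
    return [name for name, start, end, type_code in FIELDS
            if not field_ok(record[start - 1:end], type_code)]


def make_record(values):
    """Build a blank-padded sample record from {field name: text}."""
    positions = {name: (start, end) for name, start, end, _ in FIELDS}
    buffer = [" "] * RECORD_SIZE
    for name, text in values.items():
        start, end = positions[name]
        if len(text) != end - start + 1:
            raise ValueError(f"{name} must be {end - start + 1} characters")
        buffer[start - 1:end] = text
    return "".join(buffer)


good = make_record({"months_service_for_pay": "001", "race": "C",
                    "type_transaction": "G1", "transaction_date": "750601",
                    "mental_group": "3", "sex": "M"})
bad = make_record({"months_service_for_pay": "001", "race": "C",
                   "type_transaction": "G1", "transaction_date": "75O601",
                   "mental_group": "3", "sex": "1"})
good_errors = invalid_fields(good)
bad_errors = invalid_fields(bad)
```

In the second record the transaction date contains the letter O in place of a zero and the sex field holds a digit, so both fields are reported.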

The NPSGM permits the user to make aggregations for a maximum of
20 volunteer categories that are used as dependent variables. It is
not necessary that these aggregate categories be either mutually exclusive
or exhaustive. The user will be provided with a message listing any
duplicated or omitted categories. However, for the purpose of generating
input for the ELIM-COMPLIP System, it is necessary that the set of dependent
variables be mutually exclusive. On the other hand, they will not in
general be exhaustive, since regression will be used to project only those
NPS gains categories that are supply limited. An example of a mutually
exclusive but not exhaustive set of dependent variables is given in Table 6.

                               Table 6

      EXAMPLE OF CATEGORIES FOR WHICH REGRESSION RUNS CAN BE MADE

              (Aggregated from Population Partition 1)

1.  Females

2.  Male, HSG, Black, Cat I, II, III

3.  Male, HSG, Black, Cat IV, V

4.  Male, HSG, Not Black, Cat I, II, III

5.  Male, HSG, Not Black, Cat IV, V

6.  Male, NHSG, Black, Cat I, II, III

7.  Male, NHSG, Not Black, Cat I, II, III
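
The check for duplicated and omitted categories can be sketched as follows. The partition ranges (categories 1 to 196 and 197 to 250) and the limit of 20 dependent variables come from the text; the function name, the form of the returned message data, and the sample aggregations are hypothetical.

```python
from collections import Counter

# Category numbers belonging to each population partition.
PARTITIONS = {1: range(1, 197), 2: range(197, 251)}
MAX_DEPENDENT_VARIABLES = 20


def check_aggregations(aggregations, partition):
    """Return (duplicated, omitted) category numbers for one partition.

    `aggregations` maps each dependent-variable name to the list of
    detailed category numbers it sums over.
    """
    if len(aggregations) > MAX_DEPENDENT_VARIABLES:
        raise ValueError("at most 20 dependent variables are allowed")
    valid = set(PARTITIONS[partition])
    counts = Counter(c for cats in aggregations.values() for c in cats)
    outside = sorted(set(counts) - valid)
    if outside:
        raise ValueError(f"categories not in partition {partition}: {outside}")
    duplicated = sorted(c for c, n in counts.items() if n > 1)
    omitted = sorted(valid - set(counts))
    return duplicated, omitted


# Hypothetical aggregations: category 3 is used twice.
aggregations = {"group_a": [1, 2, 3], "group_b": [3, 4]}
duplicated, omitted = check_aggregations(aggregations, partition=1)
mutually_exclusive = not duplicated
```

An empty list of duplicates corresponds to the mutually exclusive set required for input to the ELIM-COMPLIP System, while a non-empty list of omissions is expected when only supply-limited categories are projected.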

The forecasts for each dependent variable are based on selected
independent variables. Examples of independent variables are military pay,
unemployment rate among the 16-21 year old male out-of-school labor force,
and advertising expenditures for accessions. The file of independent
variable data shown in the schematic of Fig. 2 contains historical values for
a set of independent variables that have proved useful in the projection
of volunteer enlistments in other GRC studies. A list of such variables
is shown in Table 7. Historical values of each of the independent variables
have been obtained from previous GRC studies. Values of these variables
must be kept current by the user. For the 12 months for which forecasts
will be made, the user has the option of specifying the values for the
corresponding independent variables or of having the program extend the
last available value to the end of the projection period.
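
The two options for the 12 forecast months can be sketched as a single helper. The function name and the sample values are hypothetical; only the behavior (use the user's values if given, otherwise repeat the last available value) comes from the text.

```python
def extend_independent_variable(history, user_values=None, horizon=12):
    """Return the historical series followed by `horizon` forecast months.

    If the user supplies values they are used as given; otherwise the
    last available historical value is carried forward.
    """
    if user_values is not None:
        if len(user_values) != horizon:
            raise ValueError(f"expected {horizon} user-specified values")
        return list(history) + list(user_values)
    if not history:
        raise ValueError("no historical value available to extend")
    return list(history) + [history[-1]] * horizon


# Hypothetical unemployment-rate history (percent).
history = [5.1, 5.3, 5.6]
carried_forward = extend_independent_variable(history)
user_specified = extend_independent_variable(history, [5.5] * 6 + [5.0] * 6)
```

Carrying the last value forward amounts to assuming that conditions at the end of the historical period persist throughout the projection period.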




Table 7

DEFINITIONS OF INDEPENDENT VARIABLES

APLCY       A dummy variable that is set during the time period when
            recruiter credit was not given for Cat III non-high
            school enlistees and is 0 otherwise. The phasedown of
            the values represents the gradual withdrawal of the policy.

AQOTDM      The variable is the number of males to be recruited by
            the Army. The quota is set by the Department of the Army.

BNSHSG      A dummy variable that is set to 1.0 each month in which
            the $1500 enlistment bonus is in effect for high school
            graduates.

BNSINC      A dummy variable that is set to 1.0 each month in which
            the $1000 incremental bonus is in effect for enlistees.

BNSKL1*     A dummy variable that is set to 1.0 for May and June 1973
            and June 1974 and subsequent months in which a bonus was
            in effect for skills.

BNSKL2      A dummy variable that is set to 1.0 each month beginning
            in June 1974 in which a bonus is in effect for skills.

BNSNUS      A dummy variable that is set to 1.0 each month in which
            the $1500 enlistment bonus is in effect for non-high school
            graduates.

CAOPTS      Number of combat arms options available.

CAT4LM      The percent of NPS enlistments permitted as mental
            Category IV.

DUNEMP      Deseasonalized unemployment rates for 16-21 year old
            out-of-school males.

HSPL74      A dummy variable that is set to one when maximum HS grad
            policy is in effect. The withdrawal of the policy is
            modeled by a ramp since a period of time is required
            for the recruiter to develop the non-HS market again
            (in the test, set equal to 1.0 in April 73 with a ramp
            before and after this month).

MICIPA      The ratio of military RMC (regular military compensation)
            for grade E-1 to the civilian average weekly wages for
            two industries: Wholesale and Retail Trade Services.

OPTSTO      The number of combat arms and service schools options;
            it measures the number of new separate options available
            to an incoming recruit to the Army.

[illegible] This variable measures the number of paid TV advertisements
            sponsored by the Army during their paid TV and
            radio advertising campaign. Source for the data is
            "Effectiveness of the Modern Volunteer Army Advertising
            Program," prepared by Stanford Research Institute for
            OSAMVA, December 1971.

PASURG      Dummy variable designed to reflect the impact of a new
            pay raise.

PRTMED      The number of media insertions placed in national
            circulation magazines plus the number of national newspaper
            campaigns.

RECASS      Number of Army recruiter assistants assigned to Hometown
            Recruiter Assistant Plan each month.

RECR        Number of Army recruiters on production each month.

TIME        A variable measuring increasing time.

TYOPT       A dummy variable reflecting when options are available
            for two year enlistees who are high school graduates.

TYOPTN      A dummy variable reflecting when options are available
            for two year enlistees who are non-high school graduates.

UOCCAN      Number of Army unit of choice canvassers.

*It is recommended that the bonus for skills for June 1974 and subsequent
months be reflected by BNSKL2 only and the corresponding values for
BNSKL1 be set to zero.



Sometimes it is desirable to use independent variables that have been
lagged by a few months to reflect the possibility of a delayed effect on
the  volunteers.  The  program  permits  up  to  nine  copies  of  each  variable, 
incorporating  the  desired  lag.  Copies  of  a variable  including  the  variable 
itself  are  designated  a family.  A limit  can  be  imposed  on  the  number  of 
members  of  a family  that  can  be  in  the  regression  solution  at  any  one  time. 
In  practice,  this  limit  has  usually  been  one. 
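The construction of a family of lagged copies can be sketched as below. This is an illustrative sketch only; the function names and the recruiter counts are hypothetical, and the treatment of the first months (dropping the observations for which the longest lag has no data) is an assumption.

```python
def lagged_family(series, max_lag):
    """Return {lag: values}, where lag 0 is the variable itself.

    A copy lagged by m months uses, in month t, the value observed in
    month t - m. Months that would reach before the start of the data
    are marked None.
    """
    if max_lag < 0:
        raise ValueError("max_lag must be non-negative")
    n = len(series)
    return {lag: [None] * lag + list(series[:n - lag])
            for lag in range(max_lag + 1)}


def usable_rows(family):
    """Drop the leading months for which the longest lag has no data."""
    max_lag = max(family)
    return {lag: values[max_lag:] for lag, values in family.items()}


# Hypothetical monthly recruiter counts (illustrative numbers only).
recruiters = [4500, 4600, 4700, 4800]
family = lagged_family(recruiters, max_lag=2)
print(usable_rows(family))
```

With a family limit of one, at most one of these copies would be allowed in the regression solution at any one time.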

Provision is made for a sign (i.e., plus, minus, or blank for either)
to be associated with each independent variable to control the sign of its
coefficient upon entering the regression equation. For most variables
except time, use of this option is desirable because the objective is not
only to obtain a good fit of the equation to the historical data but also
to obtain an accurate forecast 12 months into the future. Without sign
control a good fit may be obtained but a trend may be established between
independent and dependent variables that has an adverse effect on the
projected values. With sign control a logically sound model can be constructed
that establishes the appropriate type of correlation (i.e., positive or
negative) between dependent and independent variables. For example, the
variable reflecting a bonus for high school graduates should have a plus
sign when used with the dependent variable that represents high school
graduates. An example with a negative sign restriction is the case of
using the independent variable "high school policy 74" with dependent
variables representing not-high school graduates. Since the policy was
designed to maximize the intake of high school graduates, it tended to
minimize the not-high school graduate accessions and hence the variable
reflecting this policy is negatively correlated with the dependent
variables reflecting not-high school graduates.

For  each  dependent  variable  a set  of  independent  variables  that  are 
candidates  to  enter  the  regression  equation  must  be  specified.  Examples 
of  dependent  and  independent  variable  combinations  are  given  in  Table  8. 

The reason the user specifies the independent variables is to eliminate
those that are not applicable for the given dependent variable. Some
independent variables are obviously not applicable for a given dependent
variable* and experience may teach that certain other independent variables
do not contribute to the regression equations of certain dependent
variables. The obvious advantage of eliminating as many of the
independent variables for each dependent variable as possible is that
it reduces computer running time. Another advantage is that the regression
equation will be based on meaningful variables. Forecasts for
12 months into the future can be expected to be more reliable when
based on equations containing variables that have a logical relationship
than when based merely on a statistical relationship.

*For example, it is obvious that the independent variable representing
the combat arms bonus is not applicable to the dependent variable
representing females.

Table 8

EXAMPLES OF DEPENDENT AND CORRESPONDING INDEPENDENT VARIABLES

Dependent variables (table columns): HSG black; HSG not black;
Not HSG black; Not HSG not black. An X marks a dependent variable for
which the independent variable is a candidate. Where fewer than four X's
are shown, the source does not preserve which columns they fall in.

Independent variable*            Mnemonic   Sign          Dependent
                                            restriction   variables
Military civilian pay ratio      MICIPA     +             X X X X
Recruiters                       RECR       +             X X X X
Recruiters (-1)                  RECR       +             X X X X
Recruiters (-2)                  RECR       +             X X X
Recruiter assistants             RECASS     +             X X X X
Recruiter assistants (-1)        RECASS     +             X X X X
Recruiter assistants (-2)        RECASS     +             X X X X
Unit of choice canvassers        UOCCAN     +             X X X X
Unit of choice canvassers (-1)   UOCCAN     +             X X X X
Unit of choice canvassers (-2)   UOCCAN     +             X X X X
Combat arms options              CAOPTS     +             X X
Combat arms options (-1)         CAOPTS     +             X X
Combat arms options (-2)         CAOPTS     +             X X
Bonus - HSG                      BNSHSG     +             X X
Bonus - NHSG                     BNSNUS     +             X X
Bonus - Increment                BNSINC     +             X X
Bonus - Skills                   BNSKL2     +             X X
Two year option (HSG)            TYOPT      +             X X
Two year option (NHSG)           TYOPTN     +             X X
Total options                    OPTSTO     +             X X
Total options (-1)               OPTSTO     +             X X
Total options (-2)               OPTSTO     +             X X
Pay surge                        PASURG     +             X X X X
Cat IV limit                     CAT4LM     +             X X
Print media                      PRTMED                   X X
Unemployment                     DUNEMP     +             X X X X
Unemployment (-1)                DUNEMP     +             X X X X
Unemployment (-2)                DUNEMP     +             X X X X
High school policy 74            HSPL74     -             X X
Army policy                      APLCY      -             X X
Time                             TIME                     X X X X

*Definitions of the independent variables are given in Table 7.


REGRESSION  ANALYSIS 
Methodology 

In an earlier GRC study* a nonlinear regression model was developed
that was incorporated into the BMD02R program of the UCLA BMD computer
package. As indicated previously, the nonlinear portion of the combined
model is used to determine the multiplicative seasonal factors, while
the linear portion uses a stepwise procedure to determine the coefficients
of the independent variables in the regression equation. The analysis
assumes the dependent variable time series is initially in unadjusted or
seasonalized form and the independent variables are in deseasonalized
form (e.g., deseasonalized unemployment rates). This, however, is not
essential for the methodology of the model to hold. It is only desirable
in the sense that the model computes seasonal adjustment factors for the
dependent variable. The less of the seasonal variation of the dependent
variable that is explained by the independent variables, the more is
explained by the seasonal adjustment factors. These seasonal adjustment
factors should not be confused with the seasonal coefficients that are
computed from the forecast values and forwarded to COMPLIP-G2.

The model is solved iteratively for the seasonal adjustment factors
and the coefficients of the linear model. First, the seasonal coefficients
are set to 1.0 and BMD02R is used to solve for the linear regression
coefficients. These are then fixed and seasonal coefficients are determined
by means of a least squares fit. The seasonal coefficients are then fixed
and a new set of linear regression coefficients is determined. The
procedure is repeated until convergence is reached. A schematic of the
procedure is shown in Fig. 4.
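The alternating procedure just described can be sketched in a few lines. The sketch below is illustrative only and makes several assumptions: ordinary least squares on all supplied variables stands in for the BMD02R stepwise regression, the seasonal reciprocals are updated with the least squares ratio (sum of y times fitted value, divided by sum of y squared, taken over the months sharing a season), and the tolerance of .001 and limit of 25 iterations are the values quoted in this chapter. All function names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))  # partial pivoting
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [v - f * w for v, w in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(x_rows, y):
    """Least squares coefficients [a0, a1, ..., ap], intercept first."""
    z = [[1.0] + list(row) for row in x_rows]
    k = len(z[0])
    xtx = [[sum(r[i] * r[j] for r in z) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(z, y)) for i in range(k)]
    return solve(xtx, xty)


def fit_seasonal_regression(x_rows, y, r=12, tol=0.001, max_iter=25):
    """Alternate between linear coefficients and seasonal reciprocals."""
    n = len(y)
    s_inv = [1.0] * r              # iteration 1: seasonal factors set to 1.0
    a = ols(x_rows, y)
    for k in range(2, max_iter + 1):
        xhat = [a[0] + sum(c * v for c, v in zip(a[1:], row)) for row in x_rows]
        new_s = []
        for m in range(r):         # one reciprocal per month of the cycle
            idx = range(m, n, r)
            new_s.append(sum(y[i] * xhat[i] for i in idx) /
                         sum(y[i] ** 2 for i in idx))
        y_adj = [new_s[i % r] * y[i] for i in range(n)]
        new_a = ols(x_rows, y_adj)  # refit the linear part on adjusted data
        done = (max(abs(u - v) for u, v in zip(new_s, s_inv)) <= tol and
                max(abs(u - v) for u, v in zip(new_a, a)) <= tol)
        s_inv, a = new_s, new_a
        if done:
            break
    return s_inv, a, k


# Illustrative data with no seasonality: y = 2 + 3x exactly, two 12-month cycles.
x_rows = [[float(i % 7)] for i in range(24)]
y = [2.0 + 3.0 * row[0] for row in x_rows]
s_inv, a, iterations = fit_seasonal_regression(x_rows, y)
print(iterations, [round(c, 6) for c in a])
```

In this sketch the fitted value for a month is the linear part divided by the seasonal reciprocal, so scaling all reciprocals and all linear coefficients by a common factor leaves the fitted values unchanged; only that combination should be interpreted.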

*See Reference 4.

[Fig. 4. Basic Cycle of the Deseasonalized Step Regression Model; schematic not reproduced]

The stepwise procedure requires two user specified F statistics
as tolerances for bringing independent variables into the solution
and for removing them. Suppose $F_1$ and $F_2$ are such tolerances, where
$F_1 \le F_2$. Let $F_v$ denote the F statistic of the variable with the
largest F statistic that is not in the solution. Then, if $F_v \ge F_2$
and the variable satisfies the user imposed entering restrictions
(i.e., sign restriction and limit on the number of members per family
that can be in the solution at any one time), it is entered into the
solution. Similarly, for removing variables from the solution, if
$F_v < F_1$, the variable is excluded from the solution.
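The entry and removal tests can be summarized in a short sketch. It is illustrative only: the function names, the threshold values, and the way a candidate's trial coefficient is passed in are assumptions, not the BMD02R interface.

```python
def may_enter(name, f_value, coef, sign, family, in_solution,
              f_enter, family_limit=1):
    """True if a candidate variable passes the entry tests.

    f_value      F statistic of the candidate (largest among those not in)
    coef         coefficient the candidate would have on entering
    sign         "+", "-", or "" (blank, either sign allowed)
    family       mapping from variable name to its family name
    in_solution  names of variables currently in the solution
    """
    if f_value < f_enter:
        return False
    if sign == "+" and coef <= 0:
        return False
    if sign == "-" and coef >= 0:
        return False
    members_in = sum(1 for v in in_solution if family[v] == family[name])
    return members_in < family_limit


def must_leave(f_value, f_remove):
    """True if a variable in the solution falls below the removal tolerance."""
    return f_value < f_remove


# Hypothetical families: lagged copies share the family of the base variable.
family = {"RECR": "RECR", "RECR(-1)": "RECR", "DUNEMP": "DUNEMP"}
print(may_enter("DUNEMP", 9.0, 0.4, "+", family, ["RECR"], f_enter=4.0))
print(may_enter("RECR(-1)", 9.0, 0.4, "+", family, ["RECR"], f_enter=4.0))
```

Keeping the removal tolerance no larger than the entry tolerance, as the text requires, prevents a variable from being entered and removed on alternate steps.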

The model requires the following user input:

1.  A dependent variable $Y = \mathrm{col}(y_1, y_2, \ldots, y_n)$, which is
typically a time series with n representing the number of observed values.

2.  An independent variable data matrix $X = (x_{ij})$, an n by p matrix
where p is the number of independent variables. Each column vector
represents a time series of n observed values for an independent variable.

3.  The number of observations, n, of the (seasonalized) dependent
variable, Y. The number of observations should be more than one cycle, r
(r = 12), of the seasonal period (say, at least two cycles).

The model has the following mathematical form:

$$
\begin{aligned}
s_1^{-1} y_1 &= a_0 + x_{11} a_1 + x_{12} a_2 + \cdots + x_{1p} a_p \\
s_2^{-1} y_2 &= a_0 + x_{21} a_1 + x_{22} a_2 + \cdots + x_{2p} a_p \\
&\;\;\vdots \\
s_n^{-1} y_n &= a_0 + x_{n1} a_1 + x_{n2} a_2 + \cdots + x_{np} a_p
\end{aligned}
\tag{2}
$$

where $s_1^{-1}, s_2^{-1}, \ldots, s_n^{-1}$ are the reciprocals of the seasonal
coefficients and $a_1, a_2, \ldots, a_p$ are the regression coefficients of the
independent variables. Both sets of coefficients are determined by the
model. Since $s_i$ is a seasonal coefficient it is assumed that
$s_i = s_{i+r}$ for $i = 1, 2, \ldots, n-r$.

The basic procedure used in solving for $s_i$, $i = 1, 2, \ldots, n$ and the
$a_j$, $j = 1, 2, \ldots, p$ is the method of least squares applied to the
following function:

$$
\min_{s_i^{-1},\, a_j} \; \sum_{i=1}^{n} \Bigl( s_i^{-1} y_i - \sum_{j=0}^{p} x_{ij} a_j \Bigr)^2 ,
\qquad x_{i0} = 1
\tag{3}
$$

As stated earlier, the solution procedure is iterative. For iteration k
let $\hat{y}_i(k) = s_i^{-1}(k)\, y_i$ and let
$\hat{x}_i(k) = \sum_{j=0}^{p} x_{ij} a_j(k)$. For k = 1,
$\hat{y}_i(1) = y_i$ since initially $s_i^{-1}$ is assumed to be 1.0 for all i,
and $a_j(1)$ for $j = 0, 1, \ldots, p$ are determined by BMD02R. At iteration k,
k > 1, the method of least squares is used to solve Eq (4) for $s_i^{-1}(k)$,
$i = 1, 2, \ldots, r$, where r is assumed to be 12 and
$s_i^{-1}(k) = s_{i+r}^{-1}(k)$ for $i = 1, 2, \ldots, n-r$.

$$
\min_{s_i^{-1}(k)} \; \sum_{i=1}^{n} \Bigl( s_i^{-1}(k)\, y_i - \hat{x}_i(k-1) \Bigr)^2
\tag{4}
$$

where $\hat{x}_i(k-1)$ is a known quantity having been determined by ordinary
least squares in iteration k-1. The equations to be solved for $s_i^{-1}(k)$
are obtained by taking the partial derivatives and setting them to zero.
That is,

$$
\frac{\partial}{\partial s_i^{-1}(k)} \sum_{m=1}^{n} \Bigl( s_m^{-1}(k)\, y_m - \hat{x}_m(k-1) \Bigr)^2 = 0
\tag{5}
$$

yields, for $i = 1, 2, \ldots, r$,

$$
s_i^{-1}(k) = \frac{ y_i \hat{x}_i(k-1) + y_{i+r} \hat{x}_{i+r}(k-1) + y_{i+2r} \hat{x}_{i+2r}(k-1) + \cdots }
                   { y_i^2 + y_{i+r}^2 + y_{i+2r}^2 + \cdots }
\tag{6}
$$

The second half of iteration k makes use of the newly computed $s_i^{-1}(k)$
in computing $\hat{y}_i(k)$ and BMD02R is used to solve for the $a_j(k)$ in Eq (7).

$$
\min_{a_j(k)} \; \sum_{i=1}^{n} \Bigl( \hat{y}_i(k) - \sum_{j=0}^{p} x_{ij} a_j(k) \Bigr)^2
\tag{7}
$$

The iterative process is assumed to converge when

$$
\bigl| s_i^{-1}(k) - s_i^{-1}(k-1) \bigr| \le .001
\tag{8}
$$

for all i and

$$
\bigl| a_j(k) - a_j(k-1) \bigr| \le .001 \quad \text{for all } j.
$$

If convergence is slow or if there is no convergence, the process is
terminated at 25 iterations. Experience has shown that convergence is
usually reached within 25 iterations.

Since the BMD02R is a stepwise regression, only the independent
variables that pass the F tolerance and meet the sign criterion are
introduced into the regression equation. Generally, this means that most of the
independent variables are not in the equation if the number made available
is in the neighborhood of 15 or 20. One case was experienced at GRC where
different sets of independent variables entered the stepwise regression
on successive iterations. After 25 iterations, for such a case, the
regression equations will include only one set of the independent variables. It
should be noted that this case is a rarity.

Output

Output of the regression computer program includes the time series of
the historical and computed (using the regression equation) values of the
dependent variable, the deseasonalization factors, the coefficients of
the variables in the model, the forecasts, the residual errors, the
standard error of estimate, the coefficient of multiple determination,
and the correlation matrix. The deseasonalization factors are those used
by the model. They have been determined in conjunction with the coefficients
of those independent variables in the equation. They are not the seasonal
coefficients to be used by COMPLIP. These are computed at a later step
based on the forecast values.

The residual errors, standard error of estimate, and the coefficient
of multiple determination measure the goodness of fit of the regression
equation to the historical data. For example, the coefficient of multiple
determination, $R^2$, $0 \le R^2 \le 1$, measures the proportion of the
variation in the dependent variable that is explained by the regression
equation. The closer $R^2$ is to 1.0, the better the fit of the regression
equation. Typical values for $R^2$ are in the neighborhood of .90. The
definition is as follows:

$$
R^2 = 1 - \frac{\text{residual sum of squares}}{\text{total sum of squares}}
$$

The standard error of estimate, SE, is defined by

$$
SE = \sqrt{ \frac{\text{residual sum of squares}}{\text{residual degrees of freedom}} }
$$

The standard error of estimate is a relative measure since its magnitude
varies with the magnitude of the dependent variable. The residuals should
be checked for sequences with the same sign, indicating some unexplained
variation perhaps by the omission of some variable, such as a policy or
program variable.
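The goodness-of-fit measures and the residual check described above can be computed as in the following sketch. It is illustrative only: the function names are hypothetical, the total sum of squares is assumed to be taken about the mean of the dependent variable, and the residual degrees of freedom are assumed to be the number of observations less the number of estimated coefficients.

```python
import math


def fit_statistics(actual, fitted, n_coefficients):
    """Return (R squared, standard error of estimate, residuals)."""
    residuals = [a - f for a, f in zip(actual, fitted)]
    mean = sum(actual) / len(actual)
    rss = sum(e * e for e in residuals)            # residual sum of squares
    tss = sum((a - mean) ** 2 for a in actual)     # total sum of squares
    r_squared = 1.0 - rss / tss
    se = math.sqrt(rss / (len(actual) - n_coefficients))
    return r_squared, se, residuals


def longest_same_sign_run(residuals):
    """Length of the longest sequence of residuals sharing one sign."""
    best = run = prev = 0
    for e in residuals:
        sign = (e > 0) - (e < 0)
        if sign != 0 and sign == prev:
            run += 1
        else:
            run = 1 if sign != 0 else 0
        prev = sign
        best = max(best, run)
    return best


# Illustrative numbers only.
actual = [1.0, 2.0, 3.0, 4.0]
fitted = [1.5, 1.5, 3.5, 3.5]
r2, se, res = fit_statistics(actual, fitted, n_coefficients=2)
print(round(r2, 3), round(se, 3), longest_same_sign_run(res))
```

A long run of residuals with the same sign is the signal, noted above, that a policy or program variable may have been omitted.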

In the event an independent variable that is expected to be included
in the regression equation is not included, the reason may be that it is
highly correlated with another independent variable that is in the
equation. This may be determined by examining the correlation matrix. Another
reason for a variable's exclusion may be that it has a sign restriction
that prevents it from entering the regression equation.

The  output  of  the  preprocessor  program  lists  the  independent  variables 
(including  copies  of  independent  variables  that  have  been  lagged)  that  the 
user  has  made  available  for  each  dependent  variable.  The  beginning  of  the 
regression  output  gives  a similar  list  but  with  the  dependent  variable 
included as the first variable. Since the dependent variable is not
included in the preprocessor list, the index of an independent variable
in  the  regression  output  list  is  one  more  than  the  corresponding  index  in 
the  preprocessor  list.  The  user  should  take  note  of  this  when  making  cross 
reference  checks. 


The regression statistics are printed for the first iteration of
the solution procedure and again for the last. The statistics of the
last iteration are of greater interest since this is the final solution.
Statistics are printed for each step of the linear regression. The last
step is followed by a "Summary Table" that documents the change in the
regression equation at each step. This is followed by a table entitled
LIST OF RESIDUALS. The five columns listed in this table are entitled
CASE, YDEP, YHAT, RESIDUAL, and PCTERR.

The column entitled CASE lists the index number of the time series.
The index 1 corresponds to the first month represented in the regression.
For example, if the first month of the data base is January 1971 and the
maximum months of lag input to any of the independent variables is 2,
then the index 1 corresponds to the month of March 1971 rather than
January. The last 12 months of data listed correspond to the projection
period and are meaningless in this report except perhaps for the YHAT
values, which are computed by the regression equation using only the
linear coefficients.
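The correspondence between a CASE index and a calendar month can be worked out as in this small sketch; the function name `case_month` is hypothetical.

```python
def case_month(case, first_year, first_month, max_lag):
    """Calendar (year, month) for a 1-based CASE index.

    The first `max_lag` months of the data base are consumed by the
    longest lag, so CASE 1 falls `max_lag` months after the first month.
    """
    offset = (first_month - 1) + max_lag + (case - 1)
    return first_year + offset // 12, offset % 12 + 1


# Data base starting January 1971 with a maximum lag of 2 months.
print(case_month(1, 1971, 1, 2))
```

The same offset applies when cross-checking any row of the LIST OF RESIDUALS table against the historical file.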

The column entitled YDEP lists the historical time series of the
dependent variable. Each of these quantities has been divided by the
appropriate seasonal adjustment factor (i.e., the left hand side of
Eq (2)). The last 12 values are the exception. These are obtained by
extending the last historical value for 12 months without dividing by
the seasonal adjustment factors. The column entitled YHAT represents
the values computed by the linear portion of the regression equation
(i.e., the right hand side of Eq (2)). The RESIDUALS are defined as
YDEP - YHAT and the PCTERR as RESIDUAL/YDEP.
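One row of the LIST OF RESIDUALS table can be reproduced from the definitions above, as in this sketch. The function name and the sample numbers are hypothetical, and PCTERR is computed exactly as the ratio stated in the text, without assuming any scaling to percent.

```python
def residual_row(actual, seasonal_factor, linear_value):
    """Return (YDEP, YHAT, RESIDUAL, PCTERR) for one historical month."""
    ydep = actual / seasonal_factor   # left hand side of Eq (2)
    yhat = linear_value               # right hand side of Eq (2)
    residual = ydep - yhat
    pcterr = residual / ydep
    return ydep, yhat, residual, pcterr


# Illustrative numbers only: 1200 accessions in a month with factor 1.2.
print(residual_row(1200.0, 1.2, 950.0))
```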

The next table that is printed out is entitled MULTIPLICATIVE
SEASONALS. This table has the following five-column heading: CASE,
YDEP, YHAT, RESIDUAL, and SEASONAL. The CASE column is defined as
before. The definitions of YDEP and YHAT are slightly different from
those in the previous table. The column YDEP consists of the time series
of the actual historical values of the dependent variable (without division
by the corresponding seasonal adjustment factor as was the case in the
previous table). The YHAT column consists of the product of the appropriate
seasonal adjustment factor (for each month) and the monthly value computed
by the linear portion of the regression equation. The SEASONAL column
consists of the seasonal adjustment factor for each historical month
that is computed by the nonlinear portion of the regression model. The
RESIDUAL column is also defined as before.

Unlike the previous table, the range of this table does not include
the 12 future months. The forecast values are given in the next table
entitled FUTURE TIME STREAM with the three column headings CASE, FORECAST
and SEASONALS. The CASE and SEASONAL columns are defined as before; the
FORECAST column contains the 12 months of seasonally adjusted projected
values. It is these forecast values that are used in the post processor
programs and subsequently in COMPLIP-G2.

Sample  Regression  Results 

The results discussed here are taken from the NPSGM system test run.
The main purpose of the test run was to test the computer programs rather
than to provide evidence of the quality of forecasts of NPS gains.
Extensive analysis and experimentation with the model was not possible because
of budget constraints. The test results should be considered illustrative
of the model's capability rather than representative.

Although the volunteer data that were available extended through June
1974, only data through December 1973 were used and projections for CY 1974
were made. The reason for this is that the model should only be used to
project supply limited enlistees. For several months, beginning with December
1974, the volunteers were enlisted on a demand limited basis,* making it a
poor time interval in which to test the model. Therefore, the forecasts for
December 1974 were greatly overprojected by the test run and have been
excluded from the statistics shown in Table 9.

Two regression runs were made for each of the three population groups
shown in Table 9, one for blacks and one for non-blacks. The $R^2$ coefficient
measures the goodness of fit of the regression results to the historical
data. Values in the neighborhood of .9 are generally considered satisfactory.
Both the mean errors and the mean absolute errors are computed on the basis
of the first 11 months of the projections. The mean error reflects the
cancellation of positive and negative error terms, whereas the mean absolute
error is based on the magnitude of the errors without regard to sign.

*For future applications of the model where the demand limited accessions
are a part of the historical data, one or more special independent variables
(perhaps based on quotas) should be introduced to help explain the
enlistments during this period of departure from supply limitation.

Table 9

TEST RESULTS
(Excluding December)

Population group                        R^2       R^2           % Mean        % Mean
                                        (Black)   (Not black)   error         absolute error
1. Male NHSG, Mental Category
   I, II, III                           .97       .93           4.4           11.0
2. Male HSG, Mental Category
   I, II, III (Case 1)                  .89       [illegible]   [illegible]   12.0
3. Male HSG, Mental Category
   I, II, III (Case 2)                  .92       [illegible]   -7.6          10.0

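The two error measures can be checked against the January through November figures that Table 10 reports for population Group 1. The sketch below assumes that each percentage is the ratio of the summed differences to the summed actuals, which reproduces the 4.4 and 11.0 shown in Tables 9 and 10; the variable names are hypothetical.

```python
# Monthly totals for January through November of CY74, as listed in Table 10.
forecast = [8741, 6155, 6021, 5434, 5161, 8654, 7116, 6723, 6071, 7532, 6601]
actual = [7056, 5910, 5777, 5765, 5477, 7470, 5768, 7076, 7405, 6818, 6536]

differences = [f - a for f, a in zip(forecast, actual)]

# Mean error: positive and negative differences are allowed to cancel.
mean_error_pct = 100 * sum(differences) / sum(actual)

# Mean absolute error: magnitudes only, without regard to sign.
mean_abs_error_pct = 100 * sum(abs(d) for d in differences) / sum(actual)

print(round(mean_error_pct, 1), round(mean_abs_error_pct, 1))
```

The rounded monthly totals sum to a difference of 3151 rather than the 3150 printed in Table 10; the discrepancy of one is consistent with rounding of the unrounded forecasts.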

Population Groups 2 and 3 are identical; however, the regression used
different sets of independent (or explanatory) variables. The time variable,
defined in Table 7, was included for Group 3 and played an important role
in entering the regression equation, whereas it was omitted from Group 2.
Notice that it made an improvement in the forecasts. Additional improvements
could very likely be achieved by further analysis and experimentation with
the model.

Table 10 compares the forecasts with the actual values for the population
Group 1 defined in Table 9. Note the poor forecast for the month of December
1974. As was stated earlier, the reason for this is that December 1974 was
the beginning of the period in which accessions were demand limited rather
than supply limited. Whenever large deviations such as are shown for the
months of January and July are projected, then further analysis should be
undertaken to try to determine the causes. For example, an independent
variable may need to be constructed to reflect the result of a program or
policy decision that would explain these deviations.

It should be stated that proper use of the model requires analytical
expertise in (1) interpreting and analyzing the statistical results such
as elasticity, tests of statistical significance, collinearity,
residuals, coefficient of multiple determination ($R^2$), and standard
error of estimate; (2) formulation of explanatory variables based on
economic conditions, policy decisions and programs; and (3) use of
such variables.

Table 10

COMPARISON OF FORECASTS WITH ACTUALS FOR CY74

NPS Males, NHSG, Mental Category I, II, III

                   FORECAST                                Difference
         Case 6    Case 7                                  (Forecast        %
Month    Black     Not Black    Total      Actual          - Actual)        Difference
Jan      2198.6    6541.9        8741       7056            1685             23.9
Feb      1484.0    4670.8        6155       5910             245              4.1
Mar      1995.0    4026.0        6021       5777             244              4.2
Apr      1720.8    3713.0        5434       5765            -331             -5.7
May      2044.1    3116.4        5161       5477            -316             -5.3
Jun      3199.3    5454.5        8654       7470            1184             15.9
Jul      2016.6    5099.6        7116       5768            1348             23.4
Aug      2245.9    4476.6        6723       7076            -353             -5.0
Sep      2210.5    3859.5        6071       7405           -1334            -18.0
Oct      2047.5    5484.1        7532       6818             714             10.5
Nov      1689.4    4911.8        6601       6536              65              1.0
Dec      1484.5    4748.0        6233       2959            3274            110.6

Totals and Mean Error
         24,338    56,104      80,442     74,017            6,425             8.7

Totals and Mean Error Excluding December
                               74,208     71,058            3,150             4.4

Mean Absolute Error
                                                            7,819            11.0


POST  PROCESSOR 

One of the post processor programs aggregates the monthly forecasts
into at most four population categories used in COMPLIP. Monthly
forecasts for the supply limited groups can be combined to form data for
COMPLIP constraints on annual availability. Since the forecasts are
based on several years of historical data they are used as the basis
for computing the seasonal coefficients used by COMPLIP by simply
normalizing the first 12 months of the forecast values. Suppose for a
given COMPLIP C-group the forecasts are denoted by $y_1, y_2, \ldots, y_{12}$ and
the corresponding seasonal coefficients by $s_1, s_2, \ldots, s_{12}$. The computation
is as follows:

$$
s_i = \frac{y_i}{\sum_{j=1}^{12} y_j}, \qquad i = 1, 2, \ldots, 12.
$$
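The normalization can be sketched as follows. The sketch assumes that "normalizing" means dividing each monthly forecast by the 12-month total, so that the coefficients sum to 1.0; if the coefficients are instead meant to average 1.0, each would be multiplied by 12. The function name and the forecast values are hypothetical.

```python
def seasonal_coefficients(forecasts):
    """Normalize 12 monthly forecasts so the coefficients sum to 1.0.

    Assumption: normalization is division by the 12-month total.
    """
    if len(forecasts) != 12:
        raise ValueError("expected 12 monthly forecast values")
    total = sum(forecasts)
    return [value / total for value in forecasts]


# Illustrative forecasts for one C-group (not taken from the report).
forecasts = [100.0] * 6 + [200.0] * 6
coefficients = seasonal_coefficients(forecasts)
print([round(c, 4) for c in coefficients])
```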


Another function of the post processor is to produce printer plots of
historical and forecast values of each of the dependent variables. This
is accomplished by the graph program. The preprocessor and the regression
routines generate the files of the data to be plotted.

Chapter 3


USAGE


This chapter is oriented towards the application of NPSGM and hence
contains such information as input card and file descriptions and
operating procedures. The NPSGM consists of six programs defined as follows:

Program #1    Data generator, used for initial creation
              of the frequency file.

Program #1a   Frequency file update routine, used for
              updating an existing frequency file.

Program #2    Regression preprocessor, used for preparing
              regression inputs, and header data for the
              graph and aggregation files.

Program #3    Regression program.

Program #4    Graph program, used for developing printer
              graphs of historical and projected population
              time series.

Program #5    Aggregating program, used for aggregating
              projected population groups to form C-groups
              used in COMPLIP-G2 and compute seasonal
              coefficients.


The files in the schematics for each of the programs are denoted
as tape files, although they may be disk files. For example, if logical
unit 10 is a disk file, it is commonly referred to as TAPE10. Each file
has an MT (magnetic tape) identifier for ease in cross referencing. Table
11 gives a summary of the files, identifying the creating program, the
using program, the logical unit, and the MT identifier. Table 12 summarizes
the file characteristics and approximate volume of data to be expected. The
files are identified with the MT identifier used in Table 11 or the program
schematics presented for each program description.

Table 11

FILE SUMMARY

                                                             Logical       MT
File                       Action                Program #   unit          identifier
MVA file                   used by               1           1             MT1
ELIM III Cohort file       used by               1           2             MT2
Frequency file             created by            1           3             MT3
                           alternatively
                           created by            1a          2             MT5
                           used by               1a          1             MT3 or MT5
                           used by               2           [illegible]   MT3 or MT5
Gain/loss transaction
file                       used by               1a          4             MT4
Independent variable
data                       used by               2           10            MT6
Regression input file      created by            2           7             MT7
                           used by               3           7             MT7
Regression control cards   created by            2           50            MT8
                           used by               3           5             MT8
Scratch file 1             used by               3           1             MT9
Scratch file 2             used by               3           2             MT9a
Aggregation file           created by            2           15            MT10
                           augmented by          3           15            MT10
                           used by               5           2             MT10
Graph file                 created by            2           20            MT11/MT12 [digit illegible]
                           augmented by          3           30            MT11/MT12 [digit illegible]
                           used by               4           1             MT11/MT12 [digit illegible]
Aggregated C-groups        created by            5           10            MT13
Graphs (to be copied
to printer)                created by            4           10            MT14

Table 12

SUMMARY OF FILE CHARACTERISTICS

MT identifier   COBOL/     Binary/   Maximum record   Fixed/     Maximum
of file         FORTRAN    BCD       length*          variable   size*
MT1**           -          BCD       60               F          -
MT2             C          BCD       137              V          137,000,000
MT3             C          BCD       9                F          125,000
MT4             C          BCD       135              V          13,500,000
MT5             C          BCD       9                F          125,000
MT6             F          BCD       30               F          50,000
MT7             F          BCD       10               F          50,000
MT8             F          BCD       80               F          50,000
MT9             F          BIN       6                V          1,000
MT9a            F          BIN       255              V          30,000
MT10            F          BCD       10               V          5,000
MT11            F          BIN       10               F          5,000
MT12            F          BIN       132              V          5,000
MT13            F          BCD       60               V          3,000
MT14            F          BCD       132              V          200,000

*Measured in number of characters for coded files and number of
words for binary files.

**This file need not be processed by USAMSSA.

DATA GENERATOR (PROGRAM #1)

The data generator is a COBOL-based, batch-mode program which
creates a coded frequency file of non-prior-service volunteer data. A
schematic of it is shown in Fig. 5. One input to this program is the GRC
MVA data base, used to extract draftee data prior to June 1973; the
other input is the ELIM Cohort File, which provides volunteer data from
January 1971 to the present.** This program was conceived to be run only
one time, as another program serves to update the frequency file when
new data become available. A flowchart of the program is given in Fig. 6.

An internal data array, 250 x 50 (where 50 is a dimension that repre-
sents the number of months of data that the program may store), provides
temporary storage for volunteer accession counts until the end of the
input data. The program tabulates series for each of 250 population
categories defined in Table 2 of Chap. 2. The categories are partitioned
into two sets. They are mutually exclusive and exhaustive within each
set. The first set or partition consists of population categories 1 - 196,
the second 197 - 250.

Data items, read from input, which make up the 250 different time-series
are: sex, race, age, term of service, education, mental category and bonus
received. Category numbers 1-4 are comprised only of females and ordered
by civilian education, race and mental category. Category numbers 5-196
include only males ordered by civilian education, race, mental group, age
and bonus received. Category numbers 197-250 include both males and females
ordered by civilian education, race, mental group and term of enlistment.

The separate mention of category numbers 1-4 is solely to inform the user
that these four cells contain only female accessions data that are under a
slightly different sequence arrangement than the remainder of Parti-
tion 1. For every valid record processed by programs #1 or #1a,* one
entry is made in each of the two population partitions, so that both
partitions reflect the same population. However, the populations may differ
slightly for the two partitions because of the different data items defin-
ing each of them and the possible omission of some records due to invalid
codes.

*Program 1a is the update program to be discussed in the next section.

**When this processing occurred the cohort file contained data through
June 1974.

[Flowchart box: compute series location based on sex, race, term, etc.]

Upon reaching end-of-file on the input data, a report summarizing
the data processed is printed, along with a monthly count of the 250
series. These same data, written to tape in time-series form, become
the frequency file.
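
The tabulation described above can be sketched as follows. This is a minimal illustration in Python, not the COBOL program itself: the record layout (a tuple of year, month and the two series numbers) and the function name `tabulate` are assumptions made for the example, and a series number of `None` stands for a record whose codes were invalid for that partition.

```python
N_SERIES = 250    # population categories (series) per month
MAX_MONTHS = 50   # months of data the internal array can hold

def tabulate(records, start_year, start_month):
    """Count accessions per (series, month).

    records: hypothetical tuples (year, month, series_p1, series_p2), where
    series_p1 is in 1..196, series_p2 is in 197..250, and either may be None
    when the record's codes are invalid for that partition.
    """
    counts = [[0] * MAX_MONTHS for _ in range(N_SERIES)]
    out_of_range = 0
    for year, month, s1, s2 in records:
        m = (year - start_year) * 12 + (month - start_month)
        if not 0 <= m < MAX_MONTHS:
            out_of_range += 1          # month does not fit in the array
            continue
        if s1 is not None:
            counts[s1 - 1][m] += 1     # one entry in Partition 1
        if s2 is not None:
            counts[s2 - 1][m] += 1     # one entry in Partition 2
    return counts, out_of_range

counts, skipped = tabulate(
    [(1971, 1, 5, 197), (1971, 2, 5, None), (1980, 1, 5, 197)], 1971, 1)
```

Counting the two partitions independently is one way the partition totals could come to differ slightly, as the text notes.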

Files

Input:

TAPE1    GRC MVA accession file
TAPE2    ELIM III Cohort file

Output:

TAPE3    Frequency file
PRINTER  Report output, statistics on I/O counts and
         input data validation summary

TAPE1. This is a coded file that contains accession data used to
extract draftee information. It is of fixed record length with 40 records
per block and 60 characters per record (MVA file, MT1).

TAPE2. This is a coded file containing RA enlisted data. It con-
tains variable length records from 60 to 137 characters (ELIM III cohort
file, MT2).

TAPE3. Also a coded file, TAPE3 is the output frequency file having
a fixed record length of nine characters per record, and 250 records
per block (MT3).

Printer. This file displays totals of records read and written and
data validity counts. It also gives a report on the number of accessions
per month which are in each of the 250 series.

The Job Control for CDC Cyber 70 for the Frequency File Generator is
as follows:

T2000, MT3                    Program #1
COBOL (LR,D)
REQUEST, TAPE1, R, S.
REQUEST, TAPE2, R.
VSN, TAPE1 = 2434/2712/0538/3771/2043/1268.    *
VSN, TAPE2 = 2546/2571/2757.
REQUEST, TAPE3, W, S, VSN = SAVE.  FREQ FILE
FILE (TAPE1, BT=K, RT=F, MBL=2400, FL=60, RB=40, CM=YES)
FILE (TAPE2, BT=C, RT=Z, FL=137, CM=YES)
FILE (TAPE3, BT=K, RT=F, MBL=2250, FL=09, RB=250, CM=YES)
LDSET (FILES=TAPE1/TAPE2/TAPE3)
LGO

*These are internal GRC tape numbers.
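
As a quick consistency check on the FILE statements, the block length of each fixed-length file should equal the record length times the records per block given in the file descriptions. The sketch below assumes that MBL, FL and RB stand for maximum block length, record length and records per block, which is how the numbers in the text line up; the function name is hypothetical.

```python
def max_block_length(record_length, records_per_block):
    """Characters per block for a fixed-length, blocked coded file."""
    return record_length * records_per_block

tape1_mbl = max_block_length(60, 40)    # TAPE1: 60 characters, 40 records per block
tape3_mbl = max_block_length(9, 250)    # TAPE3: 9 characters, 250 records per block
```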

FREQUENCY FILE UPDATE ROUTINE (PROGRAM #1a)

This COBOL program, like the Data Generator, creates a monthly time-
series of 250 elements, based on specific data fields from the gain/loss
transaction file. A schematic of this program is given in Fig. 7 and a
flowchart is given in Fig. 8.

Inputs to this program are: (1) the gain/loss file, (2) the fre-
quency file and (3) one parameter card. Outputs of the program are:
(1) a new frequency file with new data applied and header records
adjusted, and (2) a printer output file which displays statistics of
the run and gives summaries of the data processed. The program has an
internally dimensioned array of 250 x 6, which allows input data to cover
a six-month range. Data outside this range will be ignored.

A parameter card specifies the last calendar month for which the fre-
quency file is to be updated. For the normal update mode, this last
update month is later than the last month of the frequency file prior to
the update. However, it is possible to add data for prior months, with
such late transaction data being combined with the corresponding data
already on the file. If the date on the parameter card is more than six
months beyond the last month of the frequency file, an error condition is
assumed and the program terminates. Other possible combinations of data
base and parameter cards are described in Fig. 9.

Upon finding the parameter card date within the update capability
of the program, as much of the original frequency file as will not be
altered during the run is copied to the output frequency file. After
processing of the data, the internal arrays are summed with the correspond-
ing months of data on the original frequency file and copied to the output
frequency file.

The frequency file contains four header records which contain the starting
date of the time-series and the length, in months, of the data base. During
an update, corrections are made to the header to reflect the addition of new
data.


[Flowchart not reproduced. Boxes recoverable from the source: open
files; read update yr/mo; read data base yr/mo from header; read data
base length (number of months) from header; Length2 = update (mo/yr)
minus data base (mo/yr); if Length2 > data base length, print error
message; copy old base (that is not modified) to new base; copy last
6 months to work array; read gains file and categorize data; add stored
6 months to new 6 months; copy to TAPE2 (new master); copy data for new
months to TAPE2; dump statistics; stop run. Note on the chart: this
program updates 5 months prior to, and including, the date specified on
the parameter card.]

Fig. 8. Flowchart of Update Program (Program #1a)

A knowledge of the ending point of the existing frequency file is
critical to a successful update.

Files

Input:

TAPE1    Frequency file (original)
TAPE4    Gain/loss file
CARD     Parameter card (1)

Output:

TAPE2    Frequency file (newly generated)
PRINTER  Lists I/O operation counts, input parameter
         card list and new header information

TAPE1. This is the coded frequency file generated by Program #1
(Data Generator) as MT3, or by Program #1a as MT5 in a previous run.

TAPE2. This file is identical to TAPE1 in format, but reflects
changes in the header and total file length due to updating. In addition
to the adding of new months of data, earlier months may be different if
late transactions are posted (MT5).

TAPE4. Gain/loss file that is used to update the frequency file
(MT4).

Printer. Contains information pertaining to the update, such as
starting and ending times of the update period, and how many months of
new data will be added. It also displays new values assigned to replace
the header records of the frequency file.

Input Card. A single parameter card identifies the last calendar month
that the update will cover. The format is simply: Month/Year (I2, 1X, I2).

All comparisons against the input frequency file starting date are based
on this date, and internal storage registers also depend on the accuracy
of the date on this card.

A typical example of a parameter card for a particular update is as
follows:

Data base length = 32, with last month of data
base being 06/75.

A parameter card of the form 10/75 would produce 4 months of new data
(07/75 through 10/75) and update 2 months of previously existing data
(05/75 and 06/75), if any data exist for these two months.
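
The arithmetic behind this example can be sketched in a few lines of Python. The sketch assumes the six-month window ends at the parameter card month and that a gap of more than six months past the end of the data base is the only error case; the other combinations covered by Fig. 9 are not modeled, and the function names are hypothetical.

```python
WINDOW = 6  # months held by the 250 x 6 internal array

def month_index(month, year):
    """Map a (month, two-digit year) pair to a running month count."""
    return year * 12 + (month - 1)

def label(index):
    """Inverse of month_index, formatted as MM/YY."""
    return f"{index % 12 + 1:02d}/{index // 12:02d}"

def plan_update(last_base, param_card):
    """Return (new months, updated existing months) for one update run."""
    base = month_index(*last_base)
    target = month_index(*param_card)
    if target - base > WINDOW:
        raise ValueError("parameter card is more than six months past the data base")
    window = range(target - WINDOW + 1, target + 1)
    new = [label(m) for m in window if m > base]
    updated = [label(m) for m in window if m <= base]
    return new, updated

new_months, updated_months = plan_update((6, 75), (10, 75))
```

With the data base ending at 06/75 and a card of 10/75, the window runs from 05/75 to 10/75, which reproduces the four new and two updated months in the example.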

REGRESSION PREPROCESSOR AND REGRESSION PROGRAMS (Programs #2 and #3)

Both are FORTRAN programs. All the input files and control cards
are supplied to the preprocessor program, which in turn prepares all the
input for the regression program. The preprocessor must therefore be
run before the regression program. The two programs must be run under the
same job control because one file (MT10) that is initially written by the
preprocessor continues to be written by the regression program and there-
fore must not be rewound at the start of the regression run. A schematic
of the two programs is shown in Fig. 10.

By use of input parameter cards, the user is able to select many com-
binations of dependent variables from the frequency file, along with inde-
pendent variables that help to explain variations in the dependent variable
and to find patterns from which these variables can be forecast.

For each dependent variable selected, the user is able to select inde-
pendent variables to be regressed against the dependent variable. This
constitutes a single problem. A subproblem uses the same dependent
variable with another set of independent variables. A listing of the
independent variable time series is given in App. A.

The regression routine used in conjunction with the preprocessor is
the nonlinear stepwise package described in Chap. 2. The BMD02R users
manual* can be referenced to help explain the listing of most of the input
data, which is generated for the user by the preprocessor.

Frequency  File  Arrangement  and  Aggregation 

The frequency file is comprised of 250 cells per month for as many
months as the data base covers. These 250 cells are broken into two
sets (or partitions) consisting of 196 and 54 categories, respectively.

For any particular run, a maximum of 20 aggregations can be made
from either of the partitions, thereby creating 20 dependent variables.
Care must be taken to ensure that for a selected run, all choices of
dependent variables come from the same partition. If any cells within
the selected partition are omitted, an error message will be given, as
will be the case where a cell from the wrong partition is selected to form
one of the dependent variables. Neither of these two error conditions
will terminate the run, but messages will be given, listing the cells
which were duplicated, omitted or included from the wrong partition.

*See Reference 5 of Chap. 2.

Fig. 10. Steps 2 and 3 in the Projection of NPS Gains

The preprocessor will only abort if the input card sequence is in error
or if one of the cards specifying the selected cells is not in ascending
order. (Refer to series card, type 3, for a further description of the
error.)

Refer to the detailed tables of the frequency file layout, Tables 13
and 14, in order to select cells that are to be combined to form each
dependent variable.
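
The checks described in this subsection can be illustrated with a short Python sketch. It is not the preprocessor's FORTRAN logic: the function name, the dictionary it returns and the use of an exception for the abort case are assumptions made for the example; only the partition ranges, the limit of 20 aggregations and the ascending-order rule come from the text.

```python
PARTITIONS = {1: range(1, 197), 2: range(197, 251)}
MAX_AGGREGATIONS = 20

def check_aggregations(partition, aggregations):
    """Report duplicated, omitted and wrong-partition cells for one run.

    aggregations: one list of cell numbers per dependent variable.
    Raises ValueError for the condition the text says aborts the run
    (cells not in ascending order); too many aggregations is treated
    the same way here as an assumption.
    """
    if len(aggregations) > MAX_AGGREGATIONS:
        raise ValueError("more than 20 aggregations requested")
    valid = set(PARTITIONS[partition])
    seen, duplicated, wrong = set(), set(), set()
    for cells in aggregations:
        if cells != sorted(cells):
            raise ValueError("selected cells are not in ascending order")
        for cell in cells:
            if cell not in valid:
                wrong.add(cell)          # cell from the other partition
            elif cell in seen:
                duplicated.add(cell)     # cell already used in this run
            else:
                seen.add(cell)
    return {
        "duplicated": sorted(duplicated),
        "omitted": sorted(valid - seen),
        "wrong_partition": sorted(wrong),
    }

report = check_aggregations(
    2, [list(range(197, 221)), list(range(220, 250)), [5]])
```

Returning a report instead of stopping mirrors the statement that these conditions produce messages but do not terminate the run.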

File Description

Preprocessor input consists of:

1)  TAPE1 (MT3 or MT5)   ELIM-COMPLIP frequency file (dependent
                         variable data).

2)  TAPE10 (MT6)         Independent variable time-series data base,
                         to be maintained by means of the system
                         update or edit features.

3)  TAPE5                User control cards.

Preprocessor output consists of:

1)  TAPE50 (MT8)         Regression control card file.

2)  TAPE7 (MT7)          Dependent and independent variable time-
                         series.

3)  TAPE6 and            Printer file with all user control card
    System printer       activity, regression card image input
                         data and statistics on data selected for
                         each regression.

4)  TAPE20 (MT11)        Date and data structure information for
                         dependent variable graphs.

5)  TAPE15 (MT10)        Data structure information for aggregations.

Regression routine input:

1)  Items (1), (2), (4) and (5) under preprocessor
    output using the same logical units except for
    TAPE50, which is read as TAPE5.

Table 13

CATEGORIES FOR TRUE VOLUNTEER NPS GAINS FREQUENCY DISTRIBUTION
(Population Partition 1)

Female^a

Category  Civilian               Mental
number    education  Race        group     Age         Bonus

1         HSG^b      black       1,2       all         no
2         HSG        non-black   1,2       all         no
3         GED^c      black       1,2       all         no
4         GED        non-black   1,2       all         no

Male^d

Category  Civilian               Mental
number    education  Race        group     Age         Bonus

5         HSG        black       1,2,3A    <18         yes
6         HSG        black       1,2,3A    <18         no
7         HSG        black       1,2,3A    ≥18, <20    yes
8         HSG        black       1,2,3A    ≥18, <20    no
9         HSG        black       1,2,3A    ≥20, <22    yes
10        HSG        black       1,2,3A    ≥20, <22    no
11        HSG        black       1,2,3A    ≥22         yes
12        HSG        black       1,2,3A    ≥22         no
13        HSG        black       3B        <18         yes
14        HSG        black       3B        <18         no
15        HSG        black       3B        ≥18, <20    yes
16        HSG        black       3B        ≥18, <20    no
17        HSG        black       3B        ≥20, <22    yes
18        HSG        black       3B        ≥20, <22    no
19        HSG        black       3B        ≥22         yes
20        HSG        black       3B        ≥22         no

Table 13 (cont'd)

Male

Category  Civilian               Mental
number    education  Race        group     Age         Bonus

21        HSG        black       4,5       <18         yes
22        HSG        black       4,5       <18         no
23        HSG        black       4,5       ≥18, <20    yes
24        HSG        black       4,5       ≥18, <20    no
25        HSG        black       4,5       ≥20, <22    yes
26        HSG        black       4,5       ≥20, <22    no
27        HSG        black       4,5       ≥22         yes
28        HSG        black       4,5       ≥22         no
29        HSG        not black   1,2,3A    <18         yes
30        HSG        not black   1,2,3A    <18         no
31        HSG        not black   1,2,3A    ≥18, <20    yes
32        HSG        not black   1,2,3A    ≥18, <20    no
33        HSG        not black   1,2,3A    ≥20, <22    yes
34        HSG        not black   1,2,3A    ≥20, <22    no
35        HSG        not black   1,2,3A    ≥22         yes
36        HSG        not black   1,2,3A    ≥22         no
37        HSG        not black   3B        <18         yes
38        HSG        not black   3B        <18         no
39        HSG        not black   3B        ≥18, <20    yes
40        HSG        not black   3B        ≥18, <20    no
41        HSG        not black   3B        ≥20, <22    yes
42        HSG        not black   3B        ≥20, <22    no
43        HSG        not black   3B        ≥22         yes
44        HSG        not black   3B        ≥22         no

Table 13 (cont'd)

Male

Category  Civilian               Mental
number    education  Race        group     Age         Bonus

45        HSG        not black   4,5       <18         yes
46        HSG        not black   4,5       <18         no
47        HSG        not black   4,5       ≥18, <20    yes
48        HSG        not black   4,5       ≥18, <20    no
49        HSG        not black   4,5       ≥20, <22    yes
50        HSG        not black   4,5       ≥20, <22    no
51        HSG        not black   4,5       ≥22         yes
52        HSG        not black   4,5       ≥22         no
53        GED        black       1,2,3A    <18         yes
54        GED        black       1,2,3A    <18         no
55        GED        black       1,2,3A    ≥18, <20    yes
56        GED        black       1,2,3A    ≥18, <20    no
57        GED        black       1,2,3A    ≥20, <22    yes
58        GED        black       1,2,3A    ≥20, <22    no
59        GED        black       1,2,3A    ≥22         yes
60        GED        black       1,2,3A    ≥22         no
61        GED        black       3B        <18         yes
62        GED        black       3B        <18         no
63        GED        black       3B        ≥18, <20    yes
64        GED        black       3B        ≥18, <20    no
65        GED        black       3B        ≥20, <22    yes
66        GED        black       3B        ≥20, <22    no
67        GED        black       3B        ≥22         yes
68        GED        black       3B        ≥22         no
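
Table 13 (cont'd)

Male

Category  Civilian               Mental
number    education  Race        group     Age         Bonus

69        GED        black       4,5       <18         yes
70        GED        black       4,5       <18         no
71        GED        black       4,5       ≥18, <20    yes
72        GED        black       4,5       ≥18, <20    no
73        GED        black       4,5       ≥20, <22    yes
74        GED        black       4,5       ≥20, <22    no
75        GED        black       4,5       ≥22         yes
76        GED        black       4,5       ≥22         no
77        GED        not black   1,2,3A    <18         yes
78        GED        not black   1,2,3A    <18         no
79        GED        not black   1,2,3A    ≥18, <20    yes
80        GED        not black   1,2,3A    ≥18, <20    no
81        GED        not black   1,2,3A    ≥20, <22    yes
82        GED        not black   1,2,3A    ≥20, <22    no
83        GED        not black   1,2,3A    ≥22         yes
84        GED        not black   1,2,3A    ≥22         no
85        GED        not black   3B        <18         yes
86        GED        not black   3B        <18         no
87        GED        not black   3B        ≥18, <20    yes
88        GED        not black   3B        ≥18, <20    no
89        GED        not black   3B        ≥20, <22    yes
90        GED        not black   3B        ≥20, <22    no
91        GED        not black   3B        ≥22         yes
92        GED        not black   3B        ≥22         no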

Table 13 (cont'd)

Male

Category  Civilian               Mental
number    education  Race        group     Age         Bonus

93        GED        not black   4,5       <18         yes
94        GED        not black   4,5       <18         no
95        GED        not black   4,5       ≥18, <20    yes
96        GED        not black   4,5       ≥18, <20    no
97        GED        not black   4,5       ≥20, <22    yes
98        GED        not black   4,5       ≥20, <22    no
99        GED        not black   4,5       ≥22         yes
100       GED        not black   4,5       ≥22         no
101       some HS    black       1,2,3A    <18         yes
102       some HS    black       1,2,3A    <18         no
103       some HS    black       1,2,3A    ≥18, <20    yes
104       some HS    black       1,2,3A    ≥18, <20    no
105       some HS    black       1,2,3A    ≥20, <22    yes
106       some HS    black       1,2,3A    ≥20, <22    no
107       some HS    black       1,2,3A    ≥22         yes
108       some HS    black       1,2,3A    ≥22         no
109       some HS    black       3B        <18         yes
110       some HS    black       3B        <18         no
111       some HS    black       3B        ≥18, <20    yes
112       some HS    black       3B        ≥18, <20    no
113       some HS    black       3B        ≥20, <22    yes
114       some HS    black       3B        ≥20, <22    no
115       some HS    black       3B        ≥22         yes
116       some HS    black       3B        ≥22         no

Table 13 (cont'd)

Male

Category  Civilian               Mental
number    education  Race        group     Age         Bonus

117       some HS    black       4,5       <18         yes
118       some HS    black       4,5       <18         no
119       some HS    black       4,5       ≥18, <20    yes
120       some HS    black       4,5       ≥18, <20    no
121       some HS    black       4,5       ≥20, <22    yes
122       some HS    black       4,5       ≥20, <22    no
123       some HS    black       4,5       ≥22         yes
124       some HS    black       4,5       ≥22         no
125       some HS    not black   1,2,3A    <18         yes
126       some HS    not black   1,2,3A    <18         no
127       some HS    not black   1,2,3A    ≥18, <20    yes
128       some HS    not black   1,2,3A    ≥18, <20    no
129       some HS    not black   1,2,3A    ≥20, <22    yes
130       some HS    not black   1,2,3A    ≥20, <22    no
131       some HS    not black   1,2,3A    ≥22         yes
132       some HS    not black   1,2,3A    ≥22         no
133       some HS    not black   3B        <18         yes
134       some HS    not black   3B        <18         no
135       some HS    not black   3B        ≥18, <20    yes
136       some HS    not black   3B        ≥18, <20    no
137       some HS    not black   3B        ≥20, <22    yes
138       some HS    not black   3B        ≥20, <22    no
139       some HS    not black   3B        ≥22         yes
140       some HS    not black   3B        ≥22         no
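
Table 13 (cont'd)

Male

Category  Civilian               Mental
number    education  Race        group     Age         Bonus

141       some HS    not black   4,5       <18         yes
142       some HS    not black   4,5       <18         no
143       some HS    not black   4,5       ≥18, <20    yes
144       some HS    not black   4,5       ≥18, <20    no
145       some HS    not black   4,5       ≥20, <22    yes
146       some HS    not black   4,5       ≥20, <22    no
147       some HS    not black   4,5       ≥22         yes
148       some HS    not black   4,5       ≥22         no
149       no HS      black       1,2,3A    <18         yes
150       no HS      black       1,2,3A    <18         no
151       no HS      black       1,2,3A    ≥18, <20    yes
152       no HS      black       1,2,3A    ≥18, <20    no
153       no HS      black       1,2,3A    ≥20, <22    yes
154       no HS      black       1,2,3A    ≥20, <22    no
155       no HS      black       1,2,3A    ≥22         yes
156       no HS      black       1,2,3A    ≥22         no
157       no HS      black       3B        <18         yes
158       no HS      black       3B        <18         no
159       no HS      black       3B        ≥18, <20    yes
160       no HS      black       3B        ≥18, <20    no
161       no HS      black       3B        ≥20, <22    yes
162       no HS      black       3B        ≥20, <22    no
163       no HS      black       3B        ≥22         yes
164       no HS      black       3B        ≥22         no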

Table 13 (cont'd)

Male

Category  Civilian               Mental
number    education  Race        group     Age         Bonus

165       no HS      black       4,5       <18         yes
166       no HS      black       4,5       <18         no
167       no HS      black       4,5       ≥18, <20    yes
168       no HS      black       4,5       ≥18, <20    no
169       no HS      black       4,5       ≥20, <22    yes
170       no HS      black       4,5       ≥20, <22    no
171       no HS      black       4,5       ≥22         yes
172       no HS      black       4,5       ≥22         no
173       no HS      not black   1,2,3A    <18         yes
174       no HS      not black   1,2,3A    <18         no
175       no HS      not black   1,2,3A    ≥18, <20    yes
176       no HS      not black   1,2,3A    ≥18, <20    no
177       no HS      not black   1,2,3A    ≥20, <22    yes
178       no HS      not black   1,2,3A    ≥20, <22    no
179       no HS      not black   1,2,3A    ≥22         yes
180       no HS      not black   1,2,3A    ≥22         no
181       no HS      not black   3B        <18         yes
182       no HS      not black   3B        <18         no
183       no HS      not black   3B        ≥18, <20    yes
184       no HS      not black   3B        ≥18, <20    no
185       no HS      not black   3B        ≥20, <22    yes
186       no HS      not black   3B        ≥20, <22    no
187       no HS      not black   3B        ≥22         yes
188       no HS      not black   3B        ≥22         no

Table 13 (cont'd)

Male

Category  Civilian               Mental
number    education  Race        group     Age         Bonus

189       no HS      not black   4,5       <18         yes
190       no HS      not black   4,5       <18         no
191       no HS      not black   4,5       ≥18, <20    yes
192       no HS      not black   4,5       ≥18, <20    no
193       no HS      not black   4,5       ≥20, <22    yes
194       no HS      not black   4,5       ≥20, <22    no
195       no HS      not black   4,5       ≥22         yes
196       no HS      not black   4,5       ≥22         no

^a It is assumed that all females are true volunteers, either HS
graduate or GED, mental group 1 or 2, and no bonus.

^b High school graduate.

^c General education development.

^d Volunteers for draft and RA.
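
The regular ordering of Table 13 means a Partition 1 category number can be computed rather than looked up. The sketch below is inferred from the table ordering only (education, then race, then mental group, then age band, then bonus); it is not taken from the COBOL source, and the function and constant names are hypothetical.

```python
EDUCATION = ["HSG", "GED", "some HS", "no HS"]
RACE = ["black", "not black"]
MENTAL_GROUP = ["1,2,3A", "3B", "4,5"]

def age_band(age):
    """0: <18, 1: >=18 and <20, 2: >=20 and <22, 3: >=22."""
    if age < 18:
        return 0
    if age < 20:
        return 1
    if age < 22:
        return 2
    return 3

def male_category(education, race, mental_group, age, bonus):
    """Category number 5..196 for a male accession (Table 13 ordering)."""
    e = EDUCATION.index(education)
    r = RACE.index(race)
    g = MENTAL_GROUP.index(mental_group)
    b = 0 if bonus else 1            # "yes" rows precede "no" rows
    return 5 + (((e * 2 + r) * 3 + g) * 4 + age_band(age)) * 2 + b

def female_category(education, race):
    """Category number 1..4; females are HSG or GED only (footnote a)."""
    return 1 + 2 * ["HSG", "GED"].index(education) + RACE.index(race)

first = male_category("HSG", "black", "1,2,3A", 17, True)
last = male_category("no HS", "not black", "4,5", 25, False)
```

The multipliers 2, 3, 4 and 2 are the number of levels of race, mental group, age band and bonus, so 4 x 2 x 3 x 4 x 2 = 192 male cells follow the four female cells.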
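
Table 14

CATEGORIES FOR TRUE VOLUNTEER NPS GAINS FREQUENCY DISTRIBUTION
(Population Partition 2)

Category  Civilian                    Mental    Term of
number    education       Race        group     enlistment

197       HSG             black       1,2,3A    [not legible in source]
198       HSG             black       3B        [not legible in source]
199       HSG             black       4,5       [not legible in source]
200       HSG             not black   1,2,3A    [not legible in source]
201       HSG             not black   3B        [not legible in source]
202       HSG             not black   4,5       [not legible in source]
203       GED             black       1,2,3A    [not legible in source]
204       GED             black       3B        [not legible in source]
205       GED             black       4,5       [not legible in source]
206       GED             not black   1,2,3A    [not legible in source]
207       GED             not black   3B        [not legible in source]
208       GED             not black   4,5       [not legible in source]
209       not HSG or GED  black       1,2,3A    [not legible in source]
210       not HSG or GED  black       3B        [not legible in source]
211       not HSG or GED  black       4,5       [not legible in source]
212       not HSG or GED  not black   1,2,3A    [not legible in source]
213       not HSG or GED  not black   3B        [not legible in source]
214       not HSG or GED  not black   4,5       [not legible in source]
215       HSG             black       1,2,3A    3
216       HSG             black       3B        3
217       HSG             black       4,5       3
218       HSG             not black   1,2,3A    3
219       HSG             not black   3B        3
220       HSG             not black   4,5       3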

Table 14 (cont'd)

Category  Civilian                    Mental    Term of
number    education       Race        group     enlistment

221       GED             black       1,2,3A    3
222       GED             black       3B        3
223       GED             black       4,5       3
224       GED             not black   1,2,3A    3
225       GED             not black   3B        3
226       GED             not black   4,5       3
227       not HSG or GED  black       1,2,3A    3
228       not HSG or GED  black       3B        3
229       not HSG or GED  black       4,5       3
230       not HSG or GED  not black   1,2,3A    3
231       not HSG or GED  not black   3B        3
232       not HSG or GED  not black   4,5       3
233       HSG             black       1,2,3A    4,5,6
234       HSG             black       3B        4,5,6
235       HSG             black       4,5       4,5,6
236       HSG             not black   1,2,3A    4,5,6
237       HSG             not black   3B        4,5,6
238       HSG             not black   4,5       4,5,6
239       GED             black       1,2,3A    4,5,6
240       GED             black       3B        4,5,6
241       GED             black       4,5       4,5,6
242       GED             not black   1,2,3A    4,5,6
243       GED             not black   3B        4,5,6
244       GED             not black   4,5       4,5,6

Table 14 (cont'd)

Category  Civilian                    Mental    Term of
number    education       Race        group     enlistment

245       not HSG or GED  black       1,2,3A    4,5,6
246       not HSG or GED  black       3B        4,5,6
247       not HSG or GED  black       4,5       4,5,6
248       not HSG or GED  not black   1,2,3A    4,5,6
249       not HSG or GED  not black   3B        4,5,6
250       not HSG or GED  not black   4,5       4,5,6


Regression output:

1)  TAPE15 (MT10)        Data structure information and forecasts
                         for each regression run, to be used by the
                         aggregation program.

2)  TAPE6                Printer file with complete regression results.

3)  TAPE30 (MT12)        Date, data structure information, historical
                         and forecast data for the graph program.


Input Parameter Card Description

Six card types are used for proper initiation of the preprocessor
routine. Five are required and one is optional.

Card type #1   Must precede all other input cards and contains base
               yr/mo data, title for the run, optional 'F' levels, and
               output controls.

Card type #2   Contains information as to which table will be in use
               for the run, and the name of one of the dependent
               variables. It also contains fields for the number of
               subproblems, the number of variables added by
               transgeneration and the number of transgeneration
               cards, if any.

Card type #3   This card contains the series numbers that will be
               combined to form the dependent variable defined by card
               type #2.

Card type #4   This card simply says 'END' and designates the end of
               a section. It shows the end of the dependent variable
               section, or the end of each group of independent
               variables selected for use in conjunction with each
               dependent variable.

Card type #5   Contains the name of an independent variable, along with
               information about its effect on the regression equation,
               i.e., whether the variable can enter the equation with
               positive or negative coefficients, and what lags the
               variable may have.

Card type #6   This card is optional and is only used for transgenera-
               tion of variables. It is used in conjunction with card
               type #2.

Table 15 gives the layout for the required cards and Table 16 for the
optional cards. Figure 11 shows the order of the input cards.

PREPROCESSOR INPUT PARAMETER CARD LAYOUT

[Table body not recoverable from the source scan.]

Fig. 11. Preprocessor Input Parameter Sequencing

Detailed Description of Dependent Variable Card Type 2

1.  Field position 2 (dependent variable name)

This eight character field contains the name which the user
assigns to the dependent variable which he creates. This name is
carried throughout the regression routine and all subsequent reference
to any dependent variable is via this name.

2.  Field position 3 (subproblem number)

This number denotes the number of times the stepwise regression
subroutine is initiated for that particular dependent variable. Referring
to Fig. 11, it is necessary to insert complete sets of independent vari-
able cards, back-to-back, equal to the number specified in this field.
If this fact is overlooked, the correspondence between the dependent
variables and the sets of independent variables will be out of sequence.

3.  Field position 4 (number of transgeneration cards)

This field should be zero filled except where transgeneration
cards are to appear in the regression.

4.  Field position 5 (number of variables added due to transgeneration)

Some transgeneration codes do not create new variables, but simply
alter existing independent variables, in which case this field should be
zero. (Refer to Appendix B for details on transgeneration codes.)

Detailed Description of Series Card Type 3

1.  This card is broken into eight 10 character numeric fields. Each
of the eight fields is again broken into two 5 character fields.

2.  To use one of these 10 character fields as an example, assume
that for a particular dependent variable, the following series numbers
are to be selected: 1 to 6, 12, 37, 55 to 60, 65, and 70 to 73.

Rather than require the user to punch 1, 2, 3, 4, 5 and 6 into speci-
fied fields, the range may be specified, as shown here.

    Columns 1-10:    |    1    6|    (1 in column 5, 6 in column 10)
    Columns 11-20:   |   55   60|    (55 in columns 14-15, 60 in columns 19-20)

To specify a single element of a table the user would simply place that
series number in the second five digit field (right justified), i.e.,

    Columns 1-10:    |        12|    (12 in columns 9-10)

A blank in the fifth position of each 10 character field indicates
that it is not a series of elements, but a single element.

This example could be placed on a single series card. When a
second, third or fourth card is required, it is imperative that the
tenth position of each 10 character field be filled, otherwise the
scanning terminates and the program assumes the end of the series.
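
The field rules for the series card can be sketched as a small parser. This is an illustration in Python of the rules as stated in the text, not the preprocessor's own code; `range_field`, `single_field` and `parse_series_card` are hypothetical names, and the handling of malformed numbers is left out.

```python
def range_field(first, last):
    """10-character field holding a range: two right-justified 5-digit numbers."""
    return f"{first:5d}{last:5d}"

def single_field(number):
    """10-character field holding one series number in the second half."""
    return f"{number:10d}"

def parse_series_card(card):
    """Expand one 80-column series card into a list of series numbers."""
    card = card.ljust(80)
    series = []
    for i in range(8):
        field = card[10 * i:10 * i + 10]
        if field[9] == " ":        # tenth position blank: end of the series
            break
        last = int(field[5:10])
        if field[4] == " ":        # fifth position blank: single element
            series.append(last)
        else:                      # both halves filled: inclusive range
            series.extend(range(int(field[0:5]), last + 1))
    return series

card = "".join([range_field(1, 6), single_field(12), single_field(37),
                range_field(55, 60), single_field(65), range_field(70, 73)])
selected = parse_series_card(card)
```

The example card uses six of the eight fields, which is why the text notes that the whole selection fits on a single series card.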

Detailed  Description  of  Independent  Variable  Card,  Type  5 

1. The first data field contains the name of an independent variable
data file. This name must agree with a name in the table of names,
listed on every run of the preprocessor. Refer to App 6 for the correct
abbreviated spellings of these names.

2. Field position 2 requires a 'P', 'M', or a blank. 'P' or 'M' forces
the sign of the coefficient of the entering variable such that it can only
have a positive or negative effect, respectively, on the regression result.
If a variable is forced to have a positive coefficient and it would
otherwise have a significant negative effect in the regression, the
algorithm will exclude it from the regression.

3. Field position 3 must contain the number of variables from the same
family of variables that can be in the regression equation at any one time.
Two variables belong to the same family if they were created from the same
original independent variable by means of lagging. The original variable
also belongs to the family.

4. The fourth field position through the 16th is used in cases where
an independent variable, such as a program variable, has a lagging effect.
The first of the sixteen fields represents the maximum lag (or shift) that
a variable may possess. For each position of these fields which is filled,
a new variable will, essentially, be created. An illustration will help
to clarify these data fields.

    Variable name   Maximum number that   Lag #1   Lag #2
                    enter regression
    PAIDTV                  1               -2       -1


In this example, three variables would be generated. The first would
have a lag of 2 months, the second a lag of 1 month and the third would
be the original variable, which is included automatically. During the
course of the regression, only one of these variables could enter with
either a positive or negative coefficient.

Note: In order to properly align the data bases, the preprocessor
searches all independent variable data cards and stores the maximum shift
of all variables, and adjusts the starting point of the data bases. A
maximum shift of two months would shorten the time span of a data base
by two months.


                 generated variables
    Month     V8L0      V8L1      V8L2
      1        10      no data   no data
      2        20        10      no data
      3        30        20        10      <- new starting point


Therefore, labels generated would appear: $1V_1, 2V_2, 3V_3, \ldots,
8V_8L_0, 9V_8L_1, 10V_8L_2$.
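
The alignment rule in the note can be sketched for a single variable. The
Python below is an illustration under stated assumptions: the function
name lagged_family is hypothetical, lags are given as negative integers as
on the card, and only one variable is handled, whereas the preprocessor
takes the maximum shift over all independent variable cards.

```python
def lagged_family(values, lags):
    """Return (max_shift, {lag: aligned values}) for one independent variable.

    values: monthly observations, oldest first.
    lags:   negative integers such as [-2, -1]; lag 0 (the original
            variable) is always included.
    """
    shifts = sorted({abs(lag) for lag in lags} | {0}, reverse=True)
    max_shift = shifts[0]
    n = len(values)
    family = {}
    for shift in shifts:
        # A variable lagged by `shift` months takes the value observed
        # `shift` months earlier, so the first month with data for every
        # member of the family is month max_shift + 1.
        family[-shift] = values[max_shift - shift:n - shift]
    return max_shift, family


max_shift, family = lagged_family([10, 20, 30, 40, 50], lags=[-2, -1])
```

With the monthly values 10, 20, 30, ... the first aligned row holds 30 for
the unlagged variable, 20 for the one-month lag and 10 for the two-month
lag, which corresponds to a new starting point of month 3.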

Sample  input  parameter  listing  1: 

This  example  shows  the  relationship  of  each  dependent  to  its  indepen- 
dent variable  string. 


[Sample card listing not legible in the source copy.]

where $V_8L_0$, $V_8L_1$ and $V_8L_2$ stand for variable $V_8$ lagged by
zero, one and two months, respectively.

Sample input parameter listing 2:

This example demonstrates the placement of transgeneration cards
in a deck set-up. They always precede independent variables for the
corresponding dependent variable.

[Sample card listing not legible in the source copy.]

In  this  simple  example,  there  is  only  one  dependent  variable.  There 
are  five  basic  independent  variables  with  six  more  generated  through  lags. 

The  transgeneration  cards  will  also  create  an  extra  variable.  The 
dependent  variable  card  specifies  that  there  are  two  transgeneration  cards, 
and  that  one  new  variable  will  be  added,  due  to  transgeneration. 

The formats of the TRNGEN cards are defined in App B. The indices
referred to correspond to those assigned by BMD02R as follows, with the
lags indicated in parentheses:

    Index   Variable
      1     dependent variable
      2     RECR
      3     RECASS(-2)
      4     RECASS(-1)
      5     RECASS(0)
      6     UOCCAN(-2)
      7     UOCCAN(-1)
      8     UOCCAN(0)
      9     TYOPTN(0)
     10     CAOPTS(-2)
     11     CAOPTS(-1)
     12     CAOPTS(0)

In the first TRNGEN card, 08 denotes the additive operation, 07
the variable to be replaced, 06 the variable operated on, and 3.0
a constant. Symbolically, this is expressed by:

    $X_6 + 3.0 \rightarrow X_7$

where $X_6$ and $X_7$ refer to UOCCAN(-2) and UOCCAN(-1), respectively.
In the second TRNGEN card, 09 denotes the multiplicative operation.
Symbolically, this is expressed by

    $c \, X_6 \rightarrow X_{13}$

where $X_6$ is as before, $c$ is the constant punched on the card, and
$X_{13}$ is a new variable.
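
The effect of the two TRNGEN cards on a single case can be sketched as
follows. This is a minimal Python illustration, not BMD02R code. The
function apply_trngen is hypothetical, the data values are made up, and
the constant 0.5 used for the multiplicative card is a placeholder, since
the constant on the sample card is not legible.

```python
def apply_trngen(case, k, code, i, constant):
    """Apply one transgeneration to a single case (dict: index -> value)."""
    if code == 8:                 # additive operation: X_i + c -> X_k
        case[k] = case[i] + constant
    elif code == 9:               # multiplicative operation: c * X_i -> X_k
        case[k] = case[i] * constant
    else:
        raise ValueError(f"code {code:02d} is not handled in this sketch")
    return case


case = {6: 4.0, 7: 9.0}           # made-up values for UOCCAN(-2), UOCCAN(-1)
apply_trngen(case, k=7, code=8, i=6, constant=3.0)    # X6 + 3.0 -> X7
apply_trngen(case, k=13, code=9, i=6, constant=0.5)   # placeholder constant
```

The first call overwrites an existing variable (index 7), while the second
creates index 13, which is why the dependent variable card reports one new
variable due to transgeneration.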

POST PROCESSOR - GRAPH PROGRAM (PROGRAM #4)

This  routine  is  dependent  upon  the  successful  completion  of  the  pre- 
processor program,  to  ensure  creation  of  a file  containing  input  data  to 
this  routine.  In  addition  to  this  file,  a minimum  of  3 input  parameter 
cards  are  required. 

The output of the program is a plot for each dependent variable
selected for use in the preprocessor, as well as a deviations table
listing the dependent variable values, the regression equation estimates
(Y), the residual error based on the two preceding quantities, and the
percent error.
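
The columns of the deviations table can be sketched as below. The text
does not give the formulas, so this Python sketch assumes the residual is
the actual value minus the estimate and the percent error is the residual
relative to the actual value. The function name deviations_table is
hypothetical.

```python
def deviations_table(actual, estimated):
    """Return rows of (actual, estimate, residual, percent error)."""
    rows = []
    for y, y_hat in zip(actual, estimated):
        residual = y - y_hat                     # assumed sign convention
        # Assumed definition: residual as a percentage of the actual value.
        percent = 100.0 * residual / y if y != 0 else float("nan")
        rows.append((y, y_hat, residual, percent))
    return rows


rows = deviations_table([200.0, 100.0], [190.0, 110.0])
```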

All plots are assigned to an output file, as are the deviations tables,
but the tables are only generated if requested in the parameter cards. A
schematic of the post processor including both the graph (program #4) and
the aggregating (program #5) programs is given in Fig. 12.

Input  Files 

TAPE1 (MT12). This file contains binary data written from the regression
routine. It consists of data length, starting and ending dates, time-series
data for the dependent variable, and projections.

TAPE5. This is the parameter card file and contains control information,
such as today's date, whether graphs are to be printed, tape number of input
data, name given to the run, and end of data identifier.

Output Files

TAPE10 (MT14). Contains the graphs. This allows the graphs to be
retained for later printing, or if graphs are not desired, to be ignored.

TAPE6. This is the standard printer file and will contain titles,
input parameter specifications, and deviations tables.

Input  Card  Formats  and  Specifications 

    Card #   CC#     Format   Program   Description
                              name
    1        1-12    3A4      DATE      Today's date (in any form).
    2        1-2     I2       IG        Place '99' if only graphs are
                                        desired, otherwise leave blank.
             6-10    I5       ITAPE     Tape # of input data, if other
                                        than TAPE1.
    3        1-80    20A4     IASSUM    Any alphanumeric description of
                                        this run. User can use as many
                                        cards here as he wishes.
    n        1-4     I4       IH9       Must contain 9999 to indicate end
                                        of input data, where n is the
                                        number of data cards.
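
The card layout for the graph program can be read with a few column
slices. The Python below is a sketch only. The function name
parse_graph_cards and the dictionary keys are hypothetical, and treating a
blank ITAPE field as TAPE1 is an assumption drawn from the phrase "if
other than TAPE1".

```python
def parse_graph_cards(cards):
    """Read the graph-program parameter deck into a dictionary."""
    cards = [c.ljust(80) for c in cards]
    date = cards[0][0:12].strip()               # card 1, cols 1-12 (3A4)
    ig = cards[1][0:2].strip()                  # card 2, cols 1-2 (I2)
    itape = cards[1][5:10].strip()              # card 2, cols 6-10 (I5)
    description = []
    for card in cards[2:]:
        if card[0:4] == "9999":                 # end card, cols 1-4 (I4)
            break
        description.append(card.rstrip())       # card 3 onward, cols 1-80 (20A4)
    else:
        raise ValueError("no 9999 end card found")
    return {
        "DATE": date,
        "graphs_only": ig == "99",
        "ITAPE": int(itape) if itape else 1,    # assumed default: TAPE1
        "IASSUM": description,
    }


deck = ["1 JAN 1976", "99", "SAMPLE RUN", "9999"]   # made-up deck
params = parse_graph_cards(deck)
```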

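The following is a listing of sample input data.

[Four-card sample listing not legible in the source copy; it consists of
a date card, a control card, a run description card, and the 9999 end
card.]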

POST PROCESSOR - AGGREGATION PROGRAM (PROGRAM #5)

This routine, which is the last in the series of the NPSGM, aggregates
the dependent variables selected for use in the preprocessor program into
a maximum of four C-groups. Generated within this program are aggregated
coefficients and consolidated historical and projection data for
the C-groups defined by the user.

The outputs of the program consist of (a) one file containing 12-month
forecasts (expressed in thousands) beginning on the first month
after the end of the data base and (b) a punch file, used as input to
the COMPLIP-G2 model. The punch file contains the seasonal coefficients
associated with each month of the forecast period for each C-group.

Inputs to the program consist of a file containing projected time-series
data and input parameter cards. The input file contains projections,
organized in a similar manner to the dependent variable file in that there
are as many 12-month groups of projections as there are dependent variable
series on the first file. The card input consists of four types of cards.
They are: (1) the title card, (2) user-supplied estimates of accessions
for the 12 future months of population groups that are not included in the
regression results, but are desired to be included in the aggregations
for the C-groups to be used in COMPLIP-G2, (3) control cards indicating
the aggregations to be made, and (4) the END terminator card.

Of a maximum of 20 possible dependent variables coming from the
preprocessor, a total of four new C-groups may be created through
aggregation. The program assumes that all variables coming in will be used
in one of the C-groups, although it is not a fatal error if one is omitted.
The use of any dependent variable in more than one C-group will result in
a fatal diagnostic and a terminated run.
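
The limits stated above (at most four C-groups, at most 20 series, no
series used twice, omissions tolerated) can be expressed as a small check.
The Python below is an illustrative sketch, not the aggregation program's
logic. The function name check_c_groups and its error messages are
hypothetical.

```python
def check_c_groups(groups, n_series):
    """Validate C-group definitions against the rules stated in the text.

    groups:   list of lists of 1-based series indices, one list per C-group.
    n_series: number of series available for aggregation.
    Returns the indices left out of every group (not a fatal condition).
    """
    if len(groups) > 4:
        raise ValueError("at most four C-groups may be defined")
    if n_series > 20:
        raise ValueError("at most 20 series may be aggregated")
    seen = set()
    for members in groups:
        for index in members:
            if not 1 <= index <= n_series:
                raise ValueError(f"series {index} does not exist")
            if index in seen:
                # Fatal in the real program: a series may be used only once.
                raise ValueError(f"series {index} is listed more than once")
            seen.add(index)
    return sorted(set(range(1, n_series + 1)) - seen)


omitted = check_c_groups([[1, 3, 5, 6], [2, 4], [7, 8]], n_series=9)
```

Here series 9 is simply reported as omitted, while listing any index twice
would stop the run.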

This aggregation is done by listing the series, coming from input,
that are to be aggregated into each of the new C-groups; therefore, a
knowledge of the preprocessor output is essential for meaningful aggregation.

If the population groups defined by the dependent variables from the
regression run do not span the entire population, then the user may input
additional groups (similar to those from the regression program) by means
of card input. The index numbers assigned to the user-supplied groups will
follow those used for the dependent variables, and the total must not exceed
20. For example, if seven dependent variables were used in the regression
run (each defining a unique population group), then the first user-supplied
group would have the index eight assigned to it. The indices are used in
defining the aggregations.

The card formats are as follows:

Title Card, Card Type 1 - FORMAT (15A4)

This input consists of one card with the title to be printed on
the reports.

User-Supplied Population Groups, Card Type 2 - FORMAT (A3, F3.2)

This input consists of 12 cards for each user-supplied population
group, one card for each of the projection months defined by the regression
program. Within each group the cards must follow the same sequence projected
by the regression program. That is, if the first projection month is July,
then the cards are ordered as follows: July, August, September, ..., May,
June, for each group, one group following another group in succession. The
last card of the last user-supplied group is followed by an END card (card
type 4). The END card is read with the A3 format.

C-Group Definition, Card Type 3 - FORMAT (A3, 7X, 7(I3, 7X))

The use of as many cards as are necessary to list all of the dependent
variables and user-supplied population groups that comprise each C-group is
permitted until a maximum of 20 is reached or an END card is encountered.
Each set of cards defining a C-group is followed by an END card. The END
card is read with the A3 format.
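
Under FORMAT (A3, 7X, 7(I3, 7X)) the three-column integer fields fall in
card columns 11-13, 21-23, and so on through 71-73. The Python below
sketches how one C-group could be read from such cards. It is an
illustration only: the function name read_c_group is hypothetical, and
skipping blank integer fields is an assumption about how unused positions
are treated.

```python
def read_c_group(cards):
    """Collect the series indices of one C-group, stopping at its END card."""
    members = []
    for card in cards:
        card = card.ljust(80)
        if card[0:3] == "END":                  # A3 field, cols 1-3
            return members
        for start in range(10, 80, 10):         # I3 fields at cols 11-13, 21-23, ...
            text = card[start:start + 3].strip()
            if text:                            # assumed: blank means unused
                members.append(int(text))
    raise ValueError("C-group definition is not closed by an END card")


# A made-up definition card listing series 1, 3, 5 and 6.
definition = " " * 10 + "".join(f"{i:>3}" + " " * 7 for i in (1, 3, 5, 6))
group = read_c_group([definition, "END"])
```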

END, Card Type 4

This is a delimiter card. The word END is punched in the first three card
columns. The card is read with the format statements for card types 2 and 3.
The following is a listing of sample input data.


[Sample card listing largely illegible in the source copy. The legible
portion shows twelve cards carrying the values 100, 200, ..., 1200 as
input for user defined group 8, followed by an END card.]

Figure 13 gives a schematic of the data card setup.


[Figure: card deck consisting of the title card, 12 cards for each user
defined group, a terminator (END) card, and the cards defining C-group 1,
C-group 2 and C-group 3.]


Fig.  13 — Example  of  Input  Cards  for  Aggregation  Program 


In this case, 7 dependent variables came from the input file and two
population groups were input by cards.

Input Files

TAPE2 (MT10). This is a BCD file containing the forecasts of the
dependent variables that are to be aggregated.

TAPE5. This is the parameter card file containing control information
supplied by the user.

Output  Files 

TAPE10 (MT13). This file contains the aggregated C-groups in the
format required by the COMPLIP-G2 matrix generator as input file TAPE17.

TAPE7. This file should be assigned to the system punch. It
contains the seasonal coefficients for use by the COMPLIP-G2 matrix
generator.

Appendix A

LISTING OF INDEPENDENT VARIABLE TIME SERIES
(For definitions see Table 7 of Chap. 2)


Note: Each time series is given in the following four columns:

1. variable identifier
2. month
3. calendar year
4. monthly value


SOURCES OF INDEPENDENT VARIABLES


The sources are cited for those independent variables where it is not
obvious (e.g., most policy variables reflect Department of the Army policies;
no source is cited for them).

1. USAREC Req 08O-I is the source for the timing of the different options
and the total options available to a recruit in any given month. This is
the source for the following variables:

CAOPTS, OPTSTO, TYOPT, FYOPTN

2. N.W. Ayer Co., the Army advertising agency, is the source for the
print media variable: PRTMED.

3. "Effectiveness of the Modern Volunteer Army Advertising Program,"
prepared by Stanford Research Institute for OSAMVA, December
197[illegible], is the source for the paid TV variable: PAIDTV.

4. "Employment and Earnings," Bureau of Labor Statistics, February 1973,
is the source of the seasonal adjustment factors and the unemployment rates.
The seasonal adjustment factors for 16-19 year old males were applied to
the unemployment data on 16-21 year old males from Tables A-5 and A-7
of the same monthly publication to derive the monthly estimates of
seasonally adjusted unemployment rates: DUNEMP.

Appendix B

BMD02R TRANSGENERATION CARDS

(Reproduced from Reference 5,
pages 15-21)

B. Transgeneration Cards

The term transgeneration is used to include transformations of
input variables and creation of new variables prior to the normal
computations performed by the various programs.

The transformations described below are performed on the values
of the variables in each case. In these examples, the symbol $X_i$
will denote the $i$th variable as well as its value.

Examples:

    $X_4 \rightarrow X_1$          $X_4$ replaces $X_1$
    $X_1 + X_3 \rightarrow X_2$    $X_1 + X_3$ replaces $X_2$

By successive transformations, more complicated relationships
may be obtained. For example:

(i) To replace $X_5$ by $\sqrt{X_1^2 + X_3^2}$ four transformations are
required:

    Transformation                  Variables as they are stored at each step
                                    $X_1$    $X_2$  $X_3$    $X_4$  $X_5$
    $X_1^2 \rightarrow X_1$         $X_1^2$  $X_2$  $X_3$    $X_4$  $X_5$
    $X_3^2 \rightarrow X_3$         $X_1^2$  $X_2$  $X_3^2$  $X_4$  $X_5$
    $X_1 + X_3 \rightarrow X_5$     $X_1^2$  $X_2$  $X_3^2$  $X_4$  $X_1^2 + X_3^2$
    $\sqrt{X_5} \rightarrow X_5$    $X_1^2$  $X_2$  $X_3^2$  $X_4$  $\sqrt{X_1^2 + X_3^2}$

In this example, it can be seen that the original values of $X_5$
are irrelevant. Actually the variable $X_5$ may be a dummy
variable introduced by the program specifically to provide
capacity for creating new variables by transgeneration.
Dummy variables may be required for intermediate storage
in order to effect some transformations.

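(ii) To replace $X_1$ by $\exp(-\tfrac{1}{2} X_1^2)$ three
transformations are required:

    Transformation                        $X_1$                         $X_2$  $X_3$
    $X_1^2 \rightarrow X_1$               $X_1^2$                       $X_2$  $X_3$
    $-\tfrac{1}{2} X_1 \rightarrow X_1$   $-\tfrac{1}{2} X_1^2$         $X_2$  $X_3$
    $\exp(X_1) \rightarrow X_1$           $\exp(-\tfrac{1}{2} X_1^2)$   $X_2$  $X_3$

(iii) To replace $X_4$ by $X_2 + \log_{10}(X_4 - X_3 + 100)$ four
transformations are required:

    Transformation                     $X_1$  $X_2$  $X_3$  $X_4$
    $X_4 - X_3 \rightarrow X_4$        $X_1$  $X_2$  $X_3$  $X_4 - X_3$
    $X_4 + 100 \rightarrow X_4$        $X_1$  $X_2$  $X_3$  $X_4 - X_3 + 100$
    $\log_{10} X_4 \rightarrow X_4$    $X_1$  $X_2$  $X_3$  $\log_{10}(X_4 - X_3 + 100)$
    $X_2 + X_4 \rightarrow X_4$        $X_1$  $X_2$  $X_3$  $X_2 + \log_{10}(X_4 - X_3 + 100)$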

The transformations are performed in the order in which the
Transgeneration Cards appear, so that, for example, the two
transgenerations $2X_1 \rightarrow X_1$ followed by
$X_1 - 2 \rightarrow X_1$ will result in $2X_1 - 2$, whereas
$X_1 - 2 \rightarrow X_1$ followed by $2X_1 \rightarrow X_1$ will
result in $2(X_1 - 2)$.
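
The order dependence can be checked numerically. The short Python sketch
below applies the same two transformations in both orders; the helper run
and the starting value 5.0 are illustrative choices only.

```python
def run(steps, x):
    """Apply a list of one-argument transformations to x, in order."""
    for step in steps:
        x = step(x)
    return x


def double(x):
    return 2 * x          # 2*X1 -> X1


def minus_two(x):
    return x - 2          # X1 - 2 -> X1


x1 = 5.0
doubled_first = run([double, minus_two], x1)      # 2*X1 - 2
subtracted_first = run([minus_two, double], x1)   # 2*(X1 - 2)
```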

TRANSGENERATION LIST

Notation to be used in the following transgeneration list:

i, j, k are variable indices (need not be different)
c is a constant
$a_1, a_2, a_3, \ldots$ are constants
n is the number of cases, or sample size

The mean $\bar{X}_i = \frac{1}{n} \sum_{j=1}^{n} X_{ij}$

The standard deviation
$s_i = \left[ \frac{1}{n-1} \sum_{j=1}^{n} (X_{ij} - \bar{X}_i)^2 \right]^{1/2}$



    Code    Transgeneration                                  Restriction
    01      $\sqrt{X_i} \rightarrow X_k$                     $X_i \ge 0$
    02      $\sqrt{X_i} + \sqrt{X_i + 1} \rightarrow X_k$    $X_i \ge 0$
    03      $\log_{10} X_i \rightarrow X_k$                  $X_i > 0$
    04      $e^{X_i} \rightarrow X_k$                        -
    05      $\arcsin \sqrt{X_i} \rightarrow X_k$             $0 \le X_i \le 1$
    21      $\cos X_i \rightarrow X_k$                       -
    22      $\arctan X_i \rightarrow X_k$                    -
    23      $X_i^{X_j} \rightarrow X_k$                      $X_i > 0$
    24      $c^{X_i} \rightarrow X_k$                        $c > 0$
    25      $X_i \rightarrow X_k$                            -
    26      $c \rightarrow X_k$                              (Leave code i blank)
    27-39   Not defined
    40      If $X_i = a_1$ or $a_2$ or $a_3$ ..., $a_7$, then $c \rightarrow X_k$;
            otherwise $X_k$ remains unchanged.
    41      If $X_i$ is blank, then $c \rightarrow X_k$ ($X_i = -0$)*;
            otherwise $X_k$ remains unchanged.
            *Note that in reading numeric fields, a blank
            field and -0 are equivalent.
    42      If $X_i = a_1$ or $a_2$ or $a_3$ ..., $a_7$, then $X_j \rightarrow X_k$;
            otherwise $X_k$ remains unchanged.
    43      If $X_i$ is blank, then $X_j \rightarrow X_k$ ($X_i = -0$);
            otherwise $X_k$ remains unchanged.

When a violation of a restriction in the right-hand column occurs during
transgeneration, the program will print a diagnostic message. Most
programs will proceed to the next problem, if any. Some programs will
delete the case where the violation occurred and continue the computation.
Other programs will screen all the input data for additional restriction
violations before proceeding to the next problem, if any.


1. Standard Transgeneration Cards

Standard Transgeneration Cards are used with programs which
use Standard Data Input (see Section II-C). Let p denote the
number of variables in the data matrix and m the maximum
number of variables allowed by the program for any problem.

Any of the variables $X_1, \ldots, X_m$ may be used in transgeneration.
The initial values of the first p variables are read from the
input data file (Data Cards or Alternate Input Tape). The initial
values of the remaining m-p variables are left over from previous
calculations. After transgeneration of a particular case, the
values of the first p+q variables for that case are used as the
values of the transgenerated variables. If the p+q variables
required for the computation are not the first p+q, they must be
relocated. This may be done by using transgeneration code
number 25. The numbers p and q (q may be positive, negative, or
zero) are specified on the Problem Card. The indices i, j, and k
from the transgeneration list may exceed p or p+q but must never
exceed m.

Card Preparation

    Col. 1-6      TRNGEN (Mandatory)
    Col. 7-9      Variable index k
    Col. 10,11    Code from transgeneration list (restricted by
                  availability in particular program)
    Col. 12-14    Variable index i
    Col. 15-20    Variable index j or constant c
    Col. 21-25    Blank
    Col. 26       Number of $a_i$'s for transformation 40 or 42
    Col. 27-32    $a_1$ value
    Col. 33-38    $a_2$ value
      ...
    Col. 63-68    $a_7$ value

The constants c, $a_1$, ..., are punched with a decimal point if
used with variables which have an F-type format and without a decimal
point if used with variables which have an I-type format (see
Section III-C).

The Standard Transgeneration Cards for the three examples on pages
16 and 17 are:


(i)    TRNGEN001100012.0000
       TRNGEN003100032.0000
       TRNGEN00511001000003
       TRNGEN00501005000000

(ii)   TRNGEN001100012.0000
       TRNGEN00109001-0.500
       TRNGEN00104001000000

(iii)  TRNGEN00412004000003
       TRNGEN00408004100.00
       TRNGEN00403004000000
       TRNGEN00411004000002
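
The column layout of the Standard Transgeneration Card can be exercised
against examples (i) and (iii). The Python below is a minimal sketch, not
BMD02R code. The function apply_card and the lookup tables are
hypothetical, and the meanings assigned to codes 10, 11 and 12 (power,
sum, difference) are inferred from the worked examples rather than taken
from the transgeneration list, so they should be read as assumptions.

```python
import math

# Code meanings as used in the worked examples; treat them as assumptions.
UNARY = {1: math.sqrt, 3: math.log10, 4: math.exp}
WITH_CONSTANT = {8: lambda x, c: x + c,
                 9: lambda x, c: x * c,
                 10: lambda x, c: x ** c}
WITH_VARIABLE = {11: lambda x, y: x + y,
                 12: lambda x, y: x - y}


def apply_card(card, x):
    """Apply one TRNGEN card to the case x (dict: variable index -> value)."""
    if card[0:6] != "TRNGEN":
        raise ValueError("cols 1-6 must contain TRNGEN")
    k = int(card[6:9])            # cols 7-9: variable index k
    code = int(card[9:11])        # cols 10-11: transgeneration code
    i = int(card[11:14])          # cols 12-14: variable index i
    operand = card[14:20]         # cols 15-20: index j or constant c
    if code in UNARY:
        x[k] = UNARY[code](x[i])
    elif code in WITH_CONSTANT:
        x[k] = WITH_CONSTANT[code](x[i], float(operand))
    elif code in WITH_VARIABLE:
        x[k] = WITH_VARIABLE[code](x[i], x[int(operand)])
    else:
        raise ValueError(f"code {code:02d} is not handled in this sketch")
    return x


example_i = ["TRNGEN001100012.0000", "TRNGEN003100032.0000",
             "TRNGEN00511001000003", "TRNGEN00501005000000"]
case_i = {1: 3.0, 2: 0.0, 3: 4.0, 4: 0.0, 5: 0.0}   # made-up data values
for card in example_i:
    apply_card(card, case_i)

example_iii = ["TRNGEN00412004000003", "TRNGEN00408004100.00",
               "TRNGEN00403004000000", "TRNGEN00411004000002"]
case_iii = {1: 0.0, 2: 1.0, 3: 5.0, 4: 5.0}          # made-up data values
for card in example_iii:
    apply_card(card, case_iii)
```

With the made-up values shown, example (i) leaves the square root of
$3^2 + 4^2$ in variable 5, and example (iii) leaves
$1 + \log_{10}(5 - 5 + 100)$ in variable 4.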

2. Special Transgeneration Card

This card is used only in those programs which require
transformations of the form $f(Y) \rightarrow Y$. One "Special"
Transgeneration Card will specify successive transformations.

Card Preparation

    Col. 1-6      SPECTG (Mandatory)
    Col. 7        Number of transformations ($\le 8$ for each card)
    Col. 8,9      Code* for the 1st transgeneration
    Col. 10-15    Constant** for the 1st transgeneration (if none,
                  leave blank)
    Col. 16,17    Code* for the 2nd transgeneration
    Col. 18-23    Constant** for the 2nd transgeneration
      ...
    Col. 64,65    Code* for the 8th transgeneration
    Col. 66-71    Constant** for the 8th transgeneration

As an example, if the user desires the transformation

    $(\sqrt{Y^2 - 3} + 100)^2 \rightarrow Y$

he could accomplish this with the Special Transgeneration Card

    SPECTG5102.000008-3.0000100000008100.00102.0000

* Code must be one of the set of permissible codes specified in
the individual program description.

** Keypunch decimal point. The decimal point is not required for
right-justified integers.
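
The Special Transgeneration Card can be decoded in the same column-wise
way. The Python below is a sketch under stated assumptions: the function
apply_spectg is hypothetical, only the three codes used by the example
card are handled (01 square root, 08 add a constant, 10 raise to a
constant power, the last inferred from the example), and the starting
value 2.0 is made up.

```python
import math


def apply_spectg(card, y):
    """Apply the successive transformations on one SPECTG card to y."""
    if card[0:6] != "SPECTG":
        raise ValueError("cols 1-6 must contain SPECTG")
    count = int(card[6])                           # col 7: number of steps
    for n in range(count):
        start = 7 + 8 * n                          # 2 code cols + 6 constant cols
        code = int(card[start:start + 2])
        text = card[start + 2:start + 8].strip()
        c = float(text) if text else 0.0           # blank constant: none given
        if code == 1:
            y = math.sqrt(y)
        elif code == 8:
            y = y + c
        elif code == 10:
            y = y ** c
        else:
            raise ValueError(f"code {code:02d} is not handled in this sketch")
    return y


# The example card, assembled field by field to keep the columns visible.
card = "SPECTG5" + "102.0000" + "08-3.000" + "01000000" + "08100.00" + "102.0000"
result = apply_spectg(card, 2.0)   # (sqrt(2**2 - 3) + 100)**2
```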
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